\ runge4.seq Runge-Kutta ODE Solver for systems of ODEs
\
\ Forth Scientific Library Algorithm #29
\ )runge_kutta4_init ( 'dsdt n -- )
\ Initialize to use function dsdt() for n equations,
\ its stack diagram is:
\ dsdt() ( 'u 'dudt -- ) ( F: t -- )
\ (the values in the array dudt get changed)
\
\ runge_kutta4_integrate() ( 'u steps -- ) ( F: t dt -- t' )
\ Integrate the equations STEPS time steps of size DT,
\ starting at time T and ending at time T'. U is the
\ initial condition on input and the new state of the
\ system on output.
\ runge_kutta4_done ( -- )
\ Release previously allocated space.
\ )rk4qc_init ( 'dsdt n 's -- ) ( F: maxstep eps -- )
\ Initialize to use function dsdt() for n equations.
\ The initial function values are in s{. The output is also
\ in s{. The result is computed with a 5th-order Runge-Kutta
\ routine with adaptive step size control. The step size
\ controller tries to keep the fractional error of any s{
\ component below eps. The maximum step size is limited to
\ maxstep .
\
\ rk4qc_done ( -- )
\ Release previously allocated space.
\
\ rk4qc_step ( -- flag ) ( F: step t -- step' t' )
\ Do one Runge-Kutta step, using adaptive step size control.
\ The flag is FALSE if the routine succeeds, else the step size
\ has become too small. The current step size and time are one
\ the fp stack and will be updated by the routine.
\ This is an ANS Forth program requiring:
\ 1. The Floating-Point word set
\ 2. Uses words 'Private:', 'Public:' and 'Reset_Search_Order'
\ to control visibility of internal code
\ 3. The word 'v:' to define function vectors and the
\ immediate 'defines' to set them.
\ 4. The immediate words 'use(' and '&' to get function addresses
\ 5. The words 'DARRAY' and '&!' to alias arrays.
\ 6. Uses '}malloc' and '}free' to allocate and release memory
\ for dynamic arrays ( 'DARRAY' ).
\ 7. The compilation of the test code is controlled by the VALUE
\ TEST-CODE? and the conditional compilation words in the
\ Programming-Tools wordset.
\ 8. To run the code the fp stack needs to be at least 5 deep.
\ To run all examples you need 7 positions on the fp stack.
\
\ (c) Copyright 1994 Everett F. Carter. Permission is granted
\ by the author to use this software for any application provided
\ this copyright notice is preserved.
\ (The adaptive code was contributed by Marcel Hendrix)
0 [IF] Additional documentation
(see Chapter 15. 'Integration of Ordinary Differential Equations', in
'Numerical Recipes in Pascal, The Art of Scientific Computing',
by William H. Press, Brian P. Flannery, Saul A. Teukolsky and William
T. Vetterling)
Problems involving ordinary differential equations (ODEs) can always be
reduced to the study of sets of first order differential equations. This is
done by introducing new variables that are usually derivatives of each other
and of the original variable. Example:
2
dy dy
--- + q(x) -- = r(x) can be rewritten to
2 dx
dx
dy/dx = z(x); dz/dx = r(x) - q(x) z(x).
The formula for traditional Euler integration is:
y[n+1] = y[n] + h*f'(x[n], y[n])
This advances a solution from x[n] to x[n+1] = x[n] + h. It only uses
information (the derivatives f') from the start of the interval.
When a trial step to the midpoint of the interval is done we can do better:
k1 = h*f'(x[n], y[n])
k2 = h*f'(x[n]+h/2, y[n]+k1/2)
y[n+1] = y[n] + k2 + O(h^3)
This is called the second-order Runge-Kutta or midpoint method (a method
is called of order n when its error term is of order O(h^n+1)).
By taking this idea a few steps further along we arrive at the fourth-order
Runge-Kutta method:
k1 = h*f'(x[n], y[n])
k2 = h*f'(x[n]+h/2, y[n]+k1/2)
k3 = h*f'(x[n]+h/2, y[n]+k2/2)
k4 = h*f'(x[n]+h/2, y[n]+k3)
y[n+1] = y[n] + k1/6 + k2/3 + k3/3 + k4/6 + O(h^5)
You'll find that the following Forth code is an exact implementation of these
5 formulas.
When is Runge-Kutta(4) useful? Of course only when the increased number
of function evaluations (f') is compensated for by the fact that larger
steps can be taken than for Euler or Runge-Kutta(2) integration. Usually
this is the case.
In general, higher-order RK methods are not interesting because the number
of evaluations increases faster than the gain produced by the larger
stepsize they allow.
Press et al highly recommend RK(4) in combination with an adaptive stepsize
algorithm ("Runge-Kutta is for ploughing the fields"). For smooth functions
they recommend the Bulirsch-Stoer method ("high-strung racehorse").
You should be aware of the fact that a RK(4) routine _without_ adaptive
stepsize control can be a waste of computer time. This is how you'd use it:
1. integrate the ODEs between point x1 and x2 with stepsize h.
2. integrate the ODEs between point x1 and x2 with stepsize h/2.
3. compare the results. If they're "sufficiently" equal, you're done,
else goto 2.
This approach is useful when you just want a low resolution function sketch.
(most people would even leave out steps 2 and 3).
The fixed grid is exactly what is needed for plotting, an adaptive stepsize
would require the use of an interpolation algorithm after RK(4) finishes!
[THEN]
CR .( RUNGE4.SEQ V1.2 15 November 1994 EFC )
Private:
v: dsdt() \ pointer to user function t, u, dudt
FLOAT DARRAY dum{ \ scratch space
FLOAT DARRAY dut{
FLOAT DARRAY ut{
FLOAT DARRAY dudt{
FVARIABLE h
FLOAT DARRAY u{ \ pointer to user array
0 VALUE dim
: F2/ 0.5E0 F* ;
: F2* 2.0E0 F* ;
Public:
: )runge_kutta4_init ( &dsdt n -- )
TO dim
defines dsdt()
& dum{ dim }malloc
malloc-fail? ABORT" runge_init failure (1) "
& dut{ dim }malloc
malloc-fail? ABORT" runge_init failure (2) "
& ut{ dim }malloc
malloc-fail? ABORT" runge_init failure (3) "
& dudt{ dim }malloc
malloc-fail? ABORT" runge_init failure (4) "
;
: runge_kutta4_done ( -- )
& dum{ }free
& dut{ }free
& ut{ }free
& dudt{ }free
;
Private:
: runge4_step ( -- ) ( F: t -- t' )
FDUP u{ dudt{ dsdt()
h F@ F2/
dim 0 DO
dudt{ I } F@ FOVER F* u{ I } F@ F+
ut{ I } F!
LOOP
FOVER F+ ut{ dut{ dsdt()
h F@ F2/
dim 0 DO
dut{ I } F@ FOVER F* u{ I } F@ F+
ut{ I } F!
LOOP
FOVER F+ ut{ dum{ dsdt()
h F@
dim 0 DO
dum{ I } F@ FOVER F* u{ I } F@ F+
ut{ I } F!
dum{ I } DUP F@ dut{ I } F@ F+ F!
LOOP
F+ \ tos is now t+dt
FDUP ut{ dut{ dsdt()
h F@ 6.0E0 F/
dim 0 DO
dudt{ I } F@ dut{ I } F@ F+
dum{ I } F@ F2* F+
FOVER F*
u{ I } DUP F@ F+ F!
LOOP
FDROP
;
Public:
: runge_kutta4_integrate() ( &u steps -- ) ( F: t dt -- t' )
SWAP & u{ &!
h F!
0 DO runge4_step LOOP
;
0 [IF] Adaptive Step Size Control For Runge-Kutta
The technique we'll be using is step doubling. We take each step twice,
once as a full step, then, independently, as two half steps. This creates
only about 38% of overhead because we'll get the accuracy of the half step.
A nice trick described by Press et al is that the results of the 3 operations
can be combined in a way that gives us a truncation error of order 6, not 5
as for the standard Runge-Kutta(4) (This is where the fcor factor is needed
for).
Given the results of Ringe-Kutta with the half and the full steps a new step
size is chosen:
h0 = h1 * |wanted_accuracy / delta1|^0.2
Here h stands for step size and delta1 is the maximum error of any component
of the output vector y, for the present step. If the step failed we know how
to choose the new step size h0, if the step was ok, we use h0 as the initial
step size for the next integration phase.
It is difficult to interpret "wanted_accuracy" correctly. When the dependent
variables differ enormously in amplitude you'd want to use fractional errors.
With oscillatory functions passing through zero it would be better to set
the wanted_accuracy to some part of the amplitude of these functions. Press
et al recommend an additional array uscal{ to do all this. One of the
arguments of the routine will be the vector of dependent variables at the
beginning of the proposed step. Call that u{ 1..n }. Let us require the user
to specify for each step another, corresponding, vector argument
uscal{ 1..n }, and also an overall tolerance level eps. Then the desired
accuracy for the ith equation will be
wanted_accuracy = eps * uscal{ i }
When constant fractional errors are desired you should plug y into the uscal{
calling slot. When constant absolute errors relative to some maximum values
are needed you set uscal{ i } equal to those values. A useful trick for getting
constant fractional errors except very near zero crossings is to set uscal{ i }
equal to |u{ i }| + |h * dsdt{ i }| . We'll use that in the code below.
Further refinements of the step size controller result in exponents being 0.2
or 0.25 and the use of a safety factor S of 0.9:
h0 = Sh1 * |wanted_accuracy / delta1|^0.2 wanted_accuracy >= delta1
= Sh1 * |wanted_accuracy / delta1|^0.25 wanted_accuracy < delta1
[THEN]
Private:
1E-30 FCONSTANT tiny
-0.20E0 FCONSTANT pgrow
-0.25E0 FCONSTANT pshrink
1e 15E0 F/ FCONSTANT fcor
0.9E0 FCONSTANT safety
4E0 safety F/
1E0 pgrow F/ F** FCONSTANT errcon
FVARIABLE eps
FVARIABLE step
FVARIABLE tstart
FVARIABLE maxstep
FLOAT DARRAY uorig{
FLOAT DARRAY u1{
FLOAT DARRAY u2{
FLOAT DARRAY uscal{
\ Find reasonable scaling values to decide when to shrink step size.
: scale'm ( -- )
tstart F@ uorig{ uscal{ dsdt()
dim 0 DO uscal{ I } DUP F@ step F@ F* FABS
uorig{ I } F@ FABS F+ tiny F+
F!
LOOP ;
\ With a trick the result of a step can be made accurate to 5th order.
: 4th->5th ( -- )
dim 0 DO \ get 5th order truncation error
uorig{ I } DUP F@
FDUP u1{ I } F@ F- fcor F*
F+ F!
LOOP ;
\ Test if the step size needs shrinking
: shrink? ( -- bool ) ( F: -- diff )
0.0E0 ( errmax )
dim 0 DO
uorig{ I } F@ u1{ I } F@ F-
uscal{ I } F@ F/ FABS FMAX
LOOP
eps F@ F/ FDUP 1e F> ;
Public:
\ Initialize to use function dsdt() for n equations. The initial function
\ values are in s{. The output is also in s{. The result is computed with a
\ 5th-order Runge-Kutta routine with adaptive step size control. The step size
\ controller tries to keep the fractional error of any s{ component below eps.
\ The maximum step size is limited to maxstep .
: )rk4qc_init ( 'dsdt n 'u -- ) ( F: maxstep eps -- )
eps F! maxstep F!
& uorig{ &!
)runge_kutta4_init
& u1{ dim }malloc malloc-fail? ABORT" )rk4qc_init :: malloc (1)"
& u2{ dim }malloc malloc-fail? ABORT" )rk4qc_init :: malloc (2)"
& uscal{ dim }malloc malloc-fail? ABORT" )rk4qc_init :: malloc (3)" ;
\ Release previously allocated space.
: rk4qc_done ( -- )
runge_kutta4_done
& u1{ }free
& u2{ }free
& uscal{ }free ;
\ Do one Runge-Kutta step, using adaptive step size control. The flag is
\ FALSE if the routine succeeds, else the step size has become too small.
\ The current step size and time are one the fp stack and will be updated
\ by the routine.
: rk4qc_step ( -- flag ) ( F: step t -- step' t' )
tstart F! step F! scale'm
uorig{ u1{ dim }fcopy \ we need a fresh start after a shrink
uorig{ u2{ dim }fcopy
BEGIN
uorig{ 2 tstart F@ step F@ F2/ runge_kutta4_integrate() FDROP
u1{ 1 tstart F@ step F@ runge_kutta4_integrate() ( F: -- t' )
FDUP tstart F@ 0.0E0 F~ IF 0.0E0 FSWAP FALSE EXIT THEN
shrink? \ maximum difference between these two tries
WHILE \ too large, shrink step size
FLN pshrink F* FEXP step F@ F* safety F* step F! FDROP
u2{ uorig{ dim }fcopy \ a fresh start after a shrink...
u2{ u1{ dim }fcopy
REPEAT \ ok, grow step size for next time
FDUP errcon F< IF FDROP step F@ 4e F*
ELSE FLN pgrow F* FEXP step F@ F* safety F*
THEN
maxstep F@ FMIN \ but don't grow excessively!
FSWAP TRUE 4th->5th ;
Reset_Search_Order
TEST-CODE? [IF] \ test code ==========================================
3 FLOAT ARRAY x{
: do_output ( n 'x -- ) ( F: t -- )
CR
F.
}fprint
;
FVARIABLE _dt \ With this time increment the dampled vibration test
1.0E-2 _dt F! \ will be in error in the 4th decimal place. Smaller
\ values will give more accurate results.
: dt _dt F@ ;
: dt! _dt F! ;
1.0E0 FATAN 8.0E0 F* FCONSTANT PI*2
1.92E0 FCONSTANT cm
960.0E0 FCONSTANT km
\ A damped vibration test case:
\ u'' + cm u' + km u = 0
\ or,
\ u' = v
\ v' = - cm v - km u
: dvderivs() ( 'u 'dudt -- ) ( F: t -- )
FDROP \ does not use t
OVER 1 } F@ DUP 0 } F!
OVER 1 } F@ cm F*
SWAP 0 } F@ km F* F+ FNEGATE
1 } F!
;
: actual ( -- ) ( F: t -- a ) \ just the U value not V for damped vib.
cm FOVER F* F2/ FNEGATE FEXP
FSWAP
cm cm F* 4.0E0 F/ FNEGATE km F+ FSQRT
F* FCOS
F* PI*2 F/
;
1 VALUE /steps \ Number of integration steps
\ done for a single output value
\ One would expect to get better results for high values of /steps, and
\ that is indeed the case.
: dvib_test_twice ( n -- ) \ n is the number of time steps to run
FRAME| a |
use( dvderivs() 2 )runge_kutta4_init
2 0 DO
1E PI*2 F/ x{ 0 } F!
cm FNEGATE F2/ PI*2 F/ x{ 1 } F!
CR 0.0E0 \ initial time
2 x{ FDUP do_output ." : " FDUP actual F.
a 0 DO
x{ /steps dt /steps S>F F/ runge_kutta4_integrate()
2 x{ FDUP do_output ." : " FDUP actual F.
LOOP FDROP
I 0= IF CR ." With twice as many steps now ..." THEN
/steps 2* TO /steps
LOOP
runge_kutta4_done
1 TO /steps
|FRAME
;
: dvib_test ( n -- ) \ n is the number of time steps to run
1.0E PI*2 F/ x{ 0 } F!
cm FNEGATE F2/ PI*2 F/ x{ 1 } F!
use( dvderivs() 2 )runge_kutta4_init
CR
0.0E0 \ initial time
2 x{ FDUP do_output FDUP actual F.
0 DO
x{ 1 dt runge_kutta4_integrate()
2 x{ FDUP do_output FDUP actual F.
LOOP
FDROP CR
runge_kutta4_done
;
16.0E0 FCONSTANT sig
45.92E0 FCONSTANT r
4.0E0 FCONSTANT bp
\ the Lorenz equations for chaos:
\ dx/dt = sig * (y - x)
\ dy/dt = r * x - y - x * z
\ dz/dt = -bp * z + x * y
: derivs() ( 'u 'dudt -- ) ( F: t -- )
FDROP \ does not use t
OVER DUP 1 } F@ 0 } F@ F- sig F*
DUP 0 } F!
OVER DUP DUP 2 } F@ FNEGATE r F+
0 } F@ F*
1 } F@ F-
DUP 1 } F!
SWAP DUP DUP 0 } F@ 1 } F@ F* 2 } F@ bp F* F-
2 } F!
;
: lorenz_test ( n -- ) \ n is the number of time steps to run
0.0E0 x{ 0 } F! 1.0E0 x{ 1 } F! 0.0E0 x{ 2 } F!
use( derivs() 3 )runge_kutta4_init
CR
0.0E0 \ initial time
3 x{ FDUP do_output
0 DO
x{ 1 dt runge_kutta4_integrate()
3 x{ FDUP do_output
LOOP
FDROP CR
runge_kutta4_done
;
\ The RC discharge equation: dVc/dt = 1/RC (Vin-Vc)
100E-3 FCONSTANT tau ( R, C product is 100 ms)
10E FCONSTANT Vin ( charging voltage source is 10 Volts)
: Cderiv() ( &u &dudt -- ) ( F: t -- )
FDROP
Vin SWAP 0 } F@ F- tau F/ 0 } F! ;
: Vc ( F: t -- v )
1e FSWAP FNEGATE tau F/ FEXP F- Vin F* ;
\ Example: 10 cap_test
: cap_test ( n -- ) \ n is the number of time steps to run
0E x{ 0 } F!
use( Cderiv() 1 )runge_kutta4_init
CR
0.0E0 \ initial time
1 x{ FDUP do_output ." " FDUP Vc F.
0 DO
x{ /steps dt /steps S>F F/ runge_kutta4_integrate()
1 x{ FDUP do_output ." " FDUP Vc F.
LOOP
FDROP CR runge_kutta4_done
1 TO /steps ;
: .output1 ( F: time -- )
CR FDUP F. 1 x{ }fprint ." | " Vc F. ;
\ The charging capacitor problem revisited.
\ Example: 200e-3 50e-3 1e-3 cap_test2
: cap_test2 ( F: tend maxstep eps -- ) \ tend is the time to stop
FROT FRAME| a | \ maxstep is time between outputs
0e x{ 0 } F!
use( Cderiv() 1 x{ FOVER FSWAP )rk4qc_init
0e \ initial step and begin
BEGIN
FDUP .output1
rk4qc_step 0=
FDUP a F> OR
UNTIL
.output1 FDROP rk4qc_done |FRAME ;
: .output2 ( F: time -- )
CR FDUP F. 2 x{ }fprint ." | " actual F. ;
\ The vibration problem again.
\ Example: 800e-3 50e-3 1e-3 dvib_test2
: dvib_test2 ( F: tend maxstep eps -- ) \ n is the number of time steps to run
FROT FRAME| a | \ maxstep is time between outputs
1E PI*2 F/ x{ 0 } F!
cm FNEGATE F2/ PI*2 F/ x{ 1 } F!
use( dvderivs() 2 x{ FOVER FSWAP )rk4qc_init
0e \ initial step and start time
BEGIN
FDUP .output2
rk4qc_step 0=
FDUP a F> OR
UNTIL
.output2 FDROP rk4qc_done |FRAME ;
[THEN]