EXERCISE 07: Using EXCEL to solve inverse problems.

An inverse problem is one in which we have a set of data which we think can be explained or modelled by an equation involving one or more parameters. Stated mathematically if we have data d(x) and a model m(x) where m(x)= f(p1,p2….pn) then find the p1…pn that best fit the data.

For example if we have data that look like they fit a straight line on average then the problem is to find the values of the coefficients a and b, where y = a + b*x, such that the difference between y(x) and the data d(x) is minimized according to some criterion (usually minimized with respect to the some of the squared differences).

The straight line example is probably the simplest example of an inverse problem. Other examples might be data that vary exponentially or sinusoidally.

 

7.1 Using the EXCEL regression procedure to fit straight lines to data

Math background

Let us suppose we have data (money in your savings account) for each month x. Let us suppose you deposit b dollars each month and you start with a dollars. Then you will have for each month an amount f(x) = y = a + b*x. Now let us suppose you get a little interest each month and you spend a little each month so that the monthly value now starts to have a random component.

Let us now suppose (after several years) that we want to find out how big the random component is and to verify that the account has been increasing linearly in general.

To do this we need to evaluate the slope of the data, the intercept at zero time, and the variance or standard deviation. These are defined as.

Slope= b= (sum(x*y)-sum(x)*sum(y)/n) / sum(x*x)-(sum(x))^2/n = CSCP/CSSX

intercept = sum(y)/n - b*sum(x)/n

variance = sigma^2 = (1/(n-2))* (CSSY-CSCP^2/CSSX) where CSSY= sum(x*x)-sum(x)^2/n

Generating data and getting statistics

Generate a linear function y = a+bx using what ever numbers you want

To this function add a random number generated in example 2 and scaled however you want. Example b1 = 100 + 50*a1 +random *10

Use the tools/data analysis/regression to generate the statistics and then carefully look through the statistics so you understand what the numbers mean.

 

7.2 Using least squares to fit data with a straight line.

Instead of using the EXCEL regression program to fit a line to the data we will now use an alternative least squares method to do the same job. There are four steps to this method;

Step1. Define or get a linear data set as in previous examples. We will call the data data(t)

Step2. Define a model that we want to fit to the data. In this case we want a model that represents a linear function, specifically model(t)= a + b*t

Step3. Define the value S= sum over all values of t of (data(t)-model(t))^2

Step4. Use the EXCEL SOLVER program to minimise S by varying the paramters "a" and "b"

This will produce estimates of a and b that give the best fitting straight line to the data.

7.3 Using least squares method to fit a model to exponentially varying data

Let us suppose we have data that clearly display an exponential variation. Call these data, data(t). We can generate an example of these data by using the relationship data(t)= 3*exp(.02*t) + random value , where t could be time in units of 1 month.

Given these data we now hypothesize that the data can be fitted with a model that has an exponential variation with time. We want to determine what the parameters of a best fitting exponential function are.

So we follow the same 4 steps as in example 7.1.

Step1. Define the data as just described.

Step2. Define the model to be model(t)=a*exp(b*t)

Step3. Define S to be the sum of squared differences between the data and the model

Step4. Use SOLVER to minimize S by varying "a"and "b"

7.4 Using least squares method to fit annual variation data with a cosine function

Let us suppose we have data that clearly display an annual variation. Call these data, data(t). We can generate an example of these data by using the relationship data(t)= 3*cos(2*pi*t/12+1.2) + random value , where t would be time in units of 1 month.

Given these data we now hypothesize that the data can be fitted with a model that has a sinusoidal variation with time. We want to determine what the parameters of a best fitting cosine function are.

So we follow the same 4 steps as in example 7.1.

Step1. Define the data as just described.

Step2. Define the model to be model(t)=a*cos(2*pi*f*t+c)

Step3. Define S to be the sum of squared differences between the data and the model

Step4. Use SOLVER to minimize S by varying "a", "f" and "c"