Math Help Forum

#1
October 26th, 2009, 02:41 PM
jalko
Matlab - code optimization problem

Hi there,

I have a problem optimizing my code. I am no programmer, so I can't see how to make the computation faster; as it stands, it will take days to complete everything I need from it. From what I found on the web, multithreading is enabled by default in newer MATLAB versions, but I still see only about 27% load on the C2Q that runs it. I also used the profiler, but it told me nothing useful. I also tried the Parallel Computing Toolbox, but when I set
Code:
parfor b=1:n1
I get this error: "??? Error: File: pohpriem.m Line: 33 Column: 9
The variable d in a parfor cannot be classified."

Can someone show me how to do this correctly, or suggest other optimizations the code should have?

Here is the code:
Code:
function d=pohpriem(x, y)
matlabpool open local 2
[m1,n1]=size(x);
[m2, n2]=size(y);
c=m1-1;
i=0;
j=0;
for b=1:n1
    
    for l=0:c;
        for k=l+1:m1-1;
            A=regexp(y{k},'\.','split');
            D1=[str2num(A{1}) str2num(A{2}) str2num(A{3})];
            sD1= datenum(D1);
            B=regexp(y{k+1},'\.','split');
            D2=[str2num(B{1}) str2num(B{2}) str2num(B{3})];
            sD2=datenum(D2);
            
            if sD1 < sD2;
                break;
            else
                i=i+1;
                j=j+1;
                z(j,i)=log(x(m1-l,b))-log(x(m1-k,b));
            end
        end
    end
    [m3 n3]=size(z);
    for a=1:n3;
         l(a)=z(a,a);
        d(b,a)=l(a);
    end
    z=0;
    i=0;
    j=0;
end
d=d';
Thanks for reading!
Tom.

#2
October 26th, 2009, 05:53 PM
elbarto

Without knowing what exactly your code does and the format of the inputs, giving any advice will be difficult.

I would consider getting rid of as many for loops as possible using MATLAB's built-in functions. This can be quite a pain, and it looks like "y" is a cell array, which makes things a little more interesting. I would try "cellfun", which has the potential to cut down on the for loops you need. An example is:

Code:
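function A = mhfExample(y)
% Convert a cell array of dotted date strings to serial date numbers.
% Use default values here as an example.
if nargin < 1
    y = {'02.12.2009' '04.05.2009' '08.09.2009'};
end

% Apply the conversion to every cell; the result is a numeric array.
A = cellfun(@(x)strDate2Serial(x),y);
%A = A'; % transpose results so they display as a column vector
end

function serialDate = strDate2Serial(strdt)
% Split on the dots and convert the three fields to numbers.
tmp = regexp(strdt,'\.','split');
tmp = [str2double(tmp{1}) str2double(tmp{2}) str2double(tmp{3})];
% Note: datenum reads a 3-element vector as [year month day]. If the
% strings are day.month.year, pass tmp([3 2 1]) instead.
serialDate = datenum(tmp);
end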
which in your code corresponds (closely) to the block:
Code:
        for k=l+1:m1-1;
            A=regexp(y{k},'\.','split');
            D1=[str2num(A{1}) str2num(A{2}) str2num(A{3})];
            sD1= datenum(D1);
I would need to see some of your sample data before I can comment on whether this is faster or not. My function "strDate2Serial" could probably be written a little more nicely as well; I will have to think about how to optimise it some more (there must be an easier way to convert to double using regexp).
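
One possible shortcut for the conversion, sketched under assumptions rather than taken from the thread: it assumes the strings are day.month.year (as the sample data posted later suggests), and the name datesToSerial is made up for illustration. It parses the whole cell array with a single sscanf call instead of one regexp call per string.

```matlab
function serial = datesToSerial(y)
% y: cell array of strings such as '1.1.2000' (ASSUMED day.month.year).
% Join all strings with spaces so one sscanf call can parse them all.
joined = sprintf('%s ', y{:});
% Read three integers per date; the result is N-by-3 after the transpose.
parts = sscanf(joined, '%d.%d.%d', [3 Inf]).';
% datenum expects year, month, day in that order.
serial = datenum(parts(:, 3), parts(:, 2), parts(:, 1));
end
```

Called as datesToSerial(y), this returns a column vector with one serial date number per entry of y.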

Regards Elbarto

#3
October 26th, 2009, 10:21 PM
CaptainBlack

What is the purpose of z? If this is a local variable you are dynamically redimensioning it within the innermost loop. If possible compute its final dimensions before entering any loop and initialise it with zeros(dim1,dim2).

Also you appear only to use the diagonal elements of z (in fact as far as I can tell you are only computing these). If so only compute the diagonal elements and use a vector for them not an array.

CB
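
A minimal sketch of that advice, assuming the values of one day are already available as a vector v (the function name logDiffsOneDay is hypothetical): the result vector is sized once with zeros and filled through a running counter, so nothing is resized inside the loops.

```matlab
function d = logDiffsOneDay(v)
% v: vector of positive values observed on a single day (assumed input).
n = numel(v);
% n values give n*(n-1)/2 pairs, so the final size is known up front.
d = zeros(n * (n - 1) / 2, 1);
p = 0;                               % next free position in d
for i = 1:n - 1
    for j = i + 1:n
        p = p + 1;
        d(p) = log(v(j)) - log(v(i)); % later value minus earlier value
    end
end
end
```

A vector is enough here because each result gets its own position through the counter p; no diagonal of a matrix is needed.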

#4
October 27th, 2009, 02:57 AM
jalko

Sorry for the missing explanation. The code computes log differences for every pair of values in every column of matrix x. Vector y holds the dates, which serve as the constraint that both values of a pair must lie within the same day.

The data looks like this:
Code:
Date(vector y)| Value(matrix x)
1.1.2000| 1, 2
1.1.2000| 2, 3
1.1.2000| 5, 4
...
1.1.2009| 3, 5
2.1.2009| 8, 6
2.1.2009| 4, 7
...
Elbarto, I don't fully understand your code yet; I have to read the help to implement it. But the "datenum" function from your other piece of advice solved my first problem, so I hope you are right again.

Quote:
Originally Posted by CaptainBlack View Post
What is the purpose of z? If this is a local variable you are dynamically redimensioning it within the innermost loop. If possible compute its final dimensions before entering any loop and initialise it with zeros(dim1,dim2).

Also you appear only to use the diagonal elements of z (in fact as far as I can tell you are only computing these). If so only compute the diagonal elements and use a vector for them not an array.

CB
If I use, for example, data such as
Code:
y,x
1.1.2000,1
1.1.2000,2
2.1.2000,3
2.1.2000,4
3.1.2000,5
3.1.2000,6
then matrix z stores every outcome of the loops "l" and "k" (in this case b=1) on the diagonal, as you wrote (1OU = first outcome, 2OU = second, 3OU = third):
Code:
1OU, 0, 0
0, 2OU, 0
0, 0, 3OU
I did it this way because I need to store a result for every value of the loop variable "l" and each corresponding value of "k". Since I could not find a way to write each new value into the first unused (zero) position of a vector, I used a matrix. I could use z(l+1,k) instead, but with z(i,j) I know exactly where each value sits in z, and I can tell zeros that are real results apart from the zeros MATLAB fills into the empty places of the matrix. Maybe there is a way to write to the first zero position of a vector so that I could use it instead of the matrix, but I don't know how to do it.

I can compute the dimensions of "z" beforehand, thanks for the tip. But I assume I first need to evaluate all the date restrictions, and maybe create another dummy variable, so that I can count the combinations for every day and sum them to get the dimensions of z. So I am not sure this will speed it up, but I will try.
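
The number of rows of z can be worked out from the dates alone, before any value is touched. A minimal sketch, assuming por is a vector of serial date numbers in which rows of the same day are adjacent (the helper name countWithinDayPairs is made up for illustration): a day with n rows contributes n(n-1)/2 pairs, and the counts are summed over all days.

```matlab
function r = countWithinDayPairs(por)
% por: vector of serial dates, rows of the same day adjacent (assumed).
por = por(:);
% Positions where the date changes mark the end of each day's run.
edges = [0; find(diff(por) ~= 0); numel(por)];
len = diff(edges);                 % number of rows in each day
r = sum(len .* (len - 1) / 2);     % pairs per day, summed over all days
end
```

With r known, z can be created once with zeros(r, n1), where n1 is the number of columns of x.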

Thanks for helping me out with this.
-------------------
I solved the problem of writing to the next free position in the vector; shame on me for not seeing it before. I kept looking for some function to do it, but simply setting
Code:
i=i+1
z(i)=..
was enough.
But I would still like to make it faster, if that is somehow possible.

Last edited by jalko; October 27th, 2009 at 03:16 AM.

#5
October 27th, 2009, 05:52 AM
jalko

I changed the code like so:
Code:
function d=pohpriem(x, y)
matlabpool open local 4;
[m1,n1]=size(x);
[m2, n2]=size(y);
c=m1-1;
%i=0;
%j=0;
p=0;
por=zeros(m2, 1)
parfor i=1:m2
    A=regexp(y{i},'\.','split');
    D1=[str2num(A{1}) str2num(A{2}) str2num(A{3})];
    sD1= datenum(D1);
    por(i)=sD1
    %B=regexp(y{k+1},'\.','split');
    %D2=[str2num(B{1}) str2num(B{2}) str2num(B{3})];
    %sD2=datenum(D2);
    
end
for b=1:n1
    
    for l=0:c;
        for k=l+1:m1-1;
            if por(k) < por(k+1);
                break;
                
            else
                p=p+1
                z(p,b)=log(x(m1-l,b))-log(x(m1-k,b));
            end
        end
    end
    p=0;
    
end

d=z;
It's quicker now, but I would still like something faster if possible. When I try to use parfor for the loop over "b", for example, I still get the error I mentioned in the first post (the variable in a parfor cannot be classified). Can you show me more ways to optimize this code?

#6
October 27th, 2009, 05:54 AM
elbarto

jalko,
can you please confirm if I understand this correctly.

1) find all dates within 1 day of each other
2) compute ALL the combinations of differences in values for all these dates. If this is the case, I am unsure why there are 2 columns in x. If you can explain this line, that would be great:
Code:
z(j,i)=log(x(m1-l,b))-log(x(m1-k,b));
Also, it would be useful if you could explain simply what the output should contain, e.g. just the differences, or the dates along with the differences. You have got an interesting little problem here by the looks of it.

I'm not sure using cellfun will be any faster, as I haven't benchmarked it yet. It would be interesting to see with a large data set. To understand my previous post, look into "anonymous functions". They are quite useful; you can usually get by without them, but if you're going to be doing a bit more MATLAB they do make life easier in certain applications.
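
As a small illustration of the idea (the names addOne and lens are chosen here only for the example): an anonymous function is written as @(arguments) expression, can be stored in a variable, and can be handed directly to cellfun.

```matlab
% An anonymous function stored in a variable and called like any function.
addOne = @(v) v + 1;
result = addOne(2);

% The same syntax passed straight to cellfun: the function is applied to
% every cell, here returning the length of each date string.
lens = cellfun(@(s) numel(s), {'1.1.2000', '12.10.2009'});
```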

Elbarto

#7
October 27th, 2009, 08:06 AM
jalko

1) and 2): yes.
z had two dimensions because I have two "for" loops: one "for" over the starting date, and a second "for" over all remaining combinations within the current day with that value from the first "for". Because, as was pointed out here, this causes a lot of problems, I changed the code to this:
Code:
function d=pohpriem(x, y)
matlabpool close
matlabpool open local 4;
[m1,n1]=size(x);
[m2, n2]=size(y);
c=m1-1;
r=0;
p=0;
q=0;
por=zeros(m2, 1);
parfor i=1:m2
    A=regexp(y{i},'\.','split');
    D1=[str2num(A{1}) str2num(A{2}) str2num(A{3})];
    sD1= datenum(D1);
    por(i)=sD1
end

for j=1:m2-1;
    q=q+1;
    if por(j)~=por(j+1);
            
        h=nchoosek(q, 2);
        r=r+h;
      q=0;
      end        
end
h=nchoosek(q+1, 2);
r=r+h
z=zeros(r,n1);
for b=1:n1
    
    for l=0:c;
        for k=l+1:m1-1;
            if por(k) < por(k+1);
                break;
                
            else
                p=p+1;
                z(p,b)=log(x(m1-l,b))-log(x(m1-k,b));
            end
        end
    end
    p=0;
       
end

d=z;
My output is the values of all these combinations of differences, and because I have millions of observations, it takes quite a lot of time to compute. I optimized the constraint-checking process (a bit); now I want to use parfor for the loop over "b" or "l", but I still get the error that it cannot be used because of the way "z" is used.
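
A likely cause of the classification error is that z is indexed with the running counter p, which parfor cannot treat as a fixed slice, and that p is carried over from one pass of the "b" loop to the next. The sketch below is one possible restructuring, not the code from this thread: it assumes por holds serial dates with rows of the same day adjacent, and the function name pairLogDiffsParfor is hypothetical. Each iteration fills a private column and writes it back with the single indexing form z(:, b).

```matlab
function z = pairLogDiffsParfor(x, por)
% x   : m-by-n matrix of positive values, one column per series (assumed)
% por : m-by-1 serial dates, rows of the same day adjacent (assumed)
m = size(x, 1);
n = size(x, 2);

% First and last row of every run of equal dates.
starts = [1; find(diff(por(:)) ~= 0) + 1];
stops  = [starts(2:end) - 1; m];
len    = stops - starts + 1;
r      = sum(len .* (len - 1) / 2);   % total number of within-day pairs

z = zeros(r, n);
parfor b = 1:n
    xb  = x(:, b);        % sliced input: the only place x is indexed
    col = zeros(r, 1);    % temporary, private to this iteration
    p   = 0;              % counter is reset at the top of each iteration
    for g = 1:numel(starts)
        for i = starts(g):stops(g) - 1
            for j = i + 1:stops(g)
                p = p + 1;
                col(p) = log(xb(j)) - log(xb(i));
            end
        end
    end
    z(:, b) = col;        % sliced output: one fixed indexing form
end
end
```

Because the columns are independent, parallelising over "b" only pays off when x has many columns. In current MATLAB releases the matlabpool command has been replaced by parpool.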

Last edited by jalko; October 27th, 2009 at 09:52 AM.

#8
October 27th, 2009, 08:18 AM
elbarto

Let me sleep on that one, jalko; I will have a look tomorrow. I have no experience with parallel computing, so I'm not the best one to answer that question, but I have a few ideas I would like to try. I think I can at least cut down the use of for loops, so I will have a go tomorrow and post a quick example to see if I understand correctly what you are trying to do.

Captain Black seems to have a lot of MATLAB experience, so he might be more useful on the high-performance side of things.

Regards Elbarto

#9
October 27th, 2009, 08:35 AM
jalko

No problem, sleep well mate.

#10
October 27th, 2009, 09:55 AM
jalko

Interesting: I noticed that when the value of p is no longer echoed to the command window, the computation is dramatically faster.
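
For reference, the echo is controlled by the trailing semicolon on each statement. A minimal illustration (the loop bound of 1000 is arbitrary):

```matlab
p = 0;
for k = 1:1000
    p = p + 1;     % with the semicolon, nothing is printed
    % p = p + 1    % without it, MATLAB prints the value of p on every pass
end
```

The lines `p=p+1` and `por=zeros(m2, 1)` in the second version of the code lack the semicolon, which is why they were echoed.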