SHAZAM reads text data easily if the delimiter (separator) is a space, comma or tab.
However, the file above uses a semicolon separator (applied automatically by MS Excel in some locales), which is not supported. To work with this data, replace the semicolons with commas, tabs or spaces using a text editor.
Alternatively, the 'Text to Columns' feature in MS Excel can be used to split the data into columns. Then save the file in CSV format, explicitly selecting the Comma Delimited version if it is available.
The file also has some missing values at the bottom, and these should be replaced with a missing value code. The default in SHAZAM is -99999. After replacing the semicolons with commas and adding the missing value code, the file becomes: testdata1.csv
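For many files, the same cleanup can be scripted rather than done by hand in a text editor. The following is a minimal Python sketch, not part of SHAZAM. The function name to_comma_delimited and the sample text are hypothetical, and it assumes the numbers use decimal points (a file written with decimal commas would need extra handling before commas can serve as the separator).

```python
# SHAZAM's default missing value code, as stated in the text above.
MISSING_CODE = "-99999"


def to_comma_delimited(text, missing=MISSING_CODE):
    """Convert semicolon-separated text to comma-separated text.

    Empty fields are filled with the missing value code.
    Assumes numbers use decimal points, not decimal commas.
    """
    out_lines = []
    for line in text.splitlines():
        if not line.strip():
            continue  # drop completely blank lines
        fields = [field.strip() for field in line.split(";")]
        out_lines.append(",".join(f if f else missing for f in fields))
    return "\n".join(out_lines) + "\n"


# Hypothetical sample for illustration only (not the actual dataset).
sample = "JAN;FEBR;MARCH\n1.5;2.5;3.5\n4.5;;\n"
print(to_comma_delimited(sample), end="")
```

To process a real file, read its contents into a string, pass the string to the function, and write the result to testdata1.csv.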
Here is a simple script that produces the vector from the comma-delimited (comma-separated) file:
* Read the data into a matrix skipping the names on the first row.
* Note: It is required to specify the number of columns.
read(testdata1.csv) mydata / skiplines=1 cols=13
matrix mymat = mydata
* Calculate the total number of rows to be in the new vector
gen1 length = $rows * $cols
* Set the sample size for the new vector using this length
sample 1 length
* Create the vector from the matrix columns with the matrix command
matrix myvec = vec(mymat)
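To make the stacking step concrete, here is a small Python sketch of what a column-stacking vec operation does. It is only an illustration and assumes the usual definition of vec, in which the columns of the matrix are placed one below another. The helper name vec and the tiny 2 x 3 matrix are hypothetical stand-ins, not SHAZAM code or the real data.

```python
def vec(matrix):
    """Stack the columns of a list-of-rows matrix into one flat list."""
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if n_rows else 0
    # Outer loop over columns, inner loop over rows: column 1 first,
    # then column 2, and so on.
    return [matrix[r][c] for c in range(n_cols) for r in range(n_rows)]


# Hypothetical 2 x 3 matrix for illustration only.
mymat = [[1, 2, 3],
         [4, 5, 6]]
myvec = vec(mymat)
print(myvec)       # [1, 4, 2, 5, 3, 6]
print(len(myvec))  # 6, i.e. rows * columns
```

The length of the result is rows times columns, which is why the script above sets the sample size to that product before creating the vector.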
It is also possible to read the data as variables and use the matrix command to concatenate them. This approach does not require the number of columns to be specified, so it can sometimes be preferable. Here is how it would be done with this dataset:
* Read the data using the first row as variable names and list the data
read (testdata.shd) / names list
* Create a matrix by concatenating the variables ('|' does matrix concatenation)
matrix mymat = JAN|FEBR|MARCH|APRIL|MAY|JUNE|JULY|AUG|SEPT|OCTOB|NOV|DEC
When doing it this way for many similar files, a useful trick is to create a SHAZAM Character String (akin to an alias) for the data variable concatenation. This string can be substituted into a subsequent statement using the format [alias]. For example, the immediately preceding matrix command may be replaced by the two statements below, and the alias months can then be inserted wherever it is needed:
* Using a character string for months
months:JAN|FEBR|MARCH|APRIL|MAY|JUNE|JULY|AUG|SEPT|OCTOB|NOV|DEC
matrix mymat = [months]
Data in SHAZAM may be represented as either variables or matrices (or vectors). It is easy to convert between the two, and SHAZAM techniques can accept both. Sometimes it is preferable to use matrix data as a variable, particularly when you want to pull a column of a matrix out into a variable and don't wish to keep specifying the column index. In the example above, to convert the vector to a data variable simply do:
* Convert a vector to a variable
genr myvar = myvec
To convert it back again:
* Convert a variable to a vector ...