As a journalist, it is important to get data from reliable sources. One such source is Danmarks Statistik: http://www.dst.dk/da

 

In this post, we are going to collect data from dst.dk and import it to Google Fusion Tables.

Also, we are going to encounter lots of problems and we have to learn, how to be flexible and how to deal with them.

First, click on “Find Statistik” and then “Statistikbanken”, here you can find a wealth of information. I chose the following table: http://www.statistikbanken.dk/BOP6

Clicking on this link will show you the following site:Image

 

Here, you can choose what data specifically you want to see.

I was interested in the transport sector, especially in relation to Germany over the course of the last decade. If you press the control-key on your keyboard, you can choose multiple entries in each list, my selection looked like this:Image

Then you can click “Vis Tabel” and you will see the data set, and be able to download it as a file.

Google Tables understands .xls and tab separated .cvs files. It also supports other, but i chose the tab-separated file this time around.Image

 

We now want to import the file to Fusion Tables. Go to fusion tables http://www.google.com/drive/start/apps.html#fusiontables and create a new table, choose to upload the file and select “tab” as your separator character.Image

After the file is uploaded, you will notice that the table does not look reasonable at all!Image

Apparently, we have to clean our data set first, so Fusion Tables can understand it. Use a text-editor, f.eks Notepad++ for windows or gedit for Linux to open the file.

Here is what the file looks like for me:Image

 

We can see that the file has multiple parts, one for “Tyskland” and one for “Lande i alt” and some more information, like the name of the table. To fix this, i am going to create two new, empty files and copy and paste the data for Germany and the other countries separately. Then i add the column names ( years ) with tabs in between, make sure there are tabs between the numbers and delete unnecessary quotes.

The file for the other countries looks like this now:Image

Which i now can upload to fusion tables, which now understands the file properly

Image

 

On the next page we attribute the data to “Danmark Statistik” and provide a link, to where the data can be found.

Image

We end up with a nice table

Image

I apply the exact same process to the table with the Germany data

You can now press the + button and select “new chart”, here you can play around with different visualizations of the data. When i did this, i found that none of the visualizations really showed what i wanted. Then it hit me: the organization of the data in the table made not a lot of sense.

It is much more useful to have years organized in rows instead of having one column for each year.

I went back to the Danmark Statistik webpage and looked for options to change the format of the data. Looks like we are in luck. The page offers us an option to pivot the dataset. After some fiddling around i came up with the following organization:Image

Now i downloaded the xls file, which can be openend in Microsoft Excel or Libre / Open Office.

After a bit of copy pasting to textfiles, i had this:Image

 

Overall, the use of the xls file format made it a lot easier to create this file.

Upload the new file to fusion tables and check out the charts again. Now the graphs make a lot more sense.

Image

 

My point with this post is, that we are going to run into lots of trouble when we are working with tables and data. Formatting and organization of data is important and we have to learn how to get it right, and what tools can support us with these tasks.

Advertisements
This entry was posted in Uncategorized. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s