Saturday, April 8, 2017

Air Quality In Poland #10 - And the winner is?

So, is Rybnik really the most polluted place in Poland in 2015? Maybe not. To find out, we just need to change the data frame slice from the previous post
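# Previous post: keep only the stations listed in pollutedPlaces.
#reducedDataFrame = bigDataFrame['2015-01-01 00:00:00':'2015-12-31 23:00:00'].loc[(slice(None),pollutedPlaces), :]
# Now: keep every station for the whole of 2015.
reducedDataFrame = bigDataFrame['2015-01-01 00:00:00':'2015-12-31 23:00:00'].loc[(slice(None), slice(None)), :]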
and run the same procedures. While running the rest of the previous code, I found unexpected behavior: some stations reported pollution levels below zero. Previously I assumed that the data was more or less clean, so I didn't have to look for such errors. I was wrong. Luckily, it was pretty easy to fix: I just need to put NaN wherever a pollutant concentration is below zero. Example of the fixed classification:
import numpy as np

def C6H6qual(value):
    # Negative concentrations are measurement errors, so mark them as missing.
    if value < 0.0:
        return np.nan
    elif 0.0 <= value <= 5.0:
        return "1 Very good"
    elif 5.0 < value <= 10.0:
        return "2 Good"
    elif 10.0 < value <= 15.0:
        return "3 Moderate"
    elif 15.0 < value <= 20.0:
        return "4 Sufficient"
    elif 20.0 < value <= 50.0:
        return "5 Bad"
    elif value > 50.0:
        return "6 Very bad"
    else:
        # Only reached when value is NaN, which is passed through unchanged.
        return value
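For comparison, here is a minimal sketch of the same cleanup done in a vectorized way. It is not the code from the repository: the toy data and the names `toyFrame`, `cleanFrame` and `c6h6Quality` are made up for illustration. It assumes the measurements sit in a plain pandas DataFrame with one column per pollutant, uses `DataFrame.mask` to put NaN wherever a value is below zero, and uses `pd.cut` with the same benzene thresholds as `C6H6qual`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy measurements; the values are invented for illustration.
toyFrame = pd.DataFrame({
    "C6H6": [-1.0, 0.0, 5.0, 5.1, 20.0, 50.0, 50.1, np.nan],
    "PM10": [12.0, -3.0, 40.0, 55.0, 80.0, 20.0, 10.0, 30.0],
})

# Put NaN everywhere a concentration is below zero, for all columns at once.
cleanFrame = toyFrame.mask(toyFrame < 0)

# Same thresholds as C6H6qual: [0, 5], (5, 10], (10, 15], (15, 20], (20, 50], (50, inf).
bins = [0.0, 5.0, 10.0, 15.0, 20.0, 50.0, np.inf]
labels = ["1 Very good", "2 Good", "3 Moderate",
          "4 Sufficient", "5 Bad", "6 Very bad"]

# include_lowest=True makes the first bin closed on the left, so 0.0 counts as "1 Very good".
c6h6Quality = pd.cut(cleanFrame["C6H6"], bins=bins, labels=labels, include_lowest=True)

print(cleanFrame)
print(c6h6Quality)
```

One difference worth noting: `pd.cut` returns a categorical column, while `C6H6qual` returns plain strings, so any later grouping code may need a small adjustment.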
How do we find the really bad place after those changes? We just need to apply the proper selections and we will know immediately:
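# Station with the highest "overall" count in the '6 Very bad' category.
worstPlace = descriptiveFrame.xs('6 Very bad', level=1)["overall"].idxmax()
# All quality categories for that station.
descriptiveFrame.xs(worstPlace, level=0)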
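To make the selection easier to follow, here is a small sketch on toy data, not on the real results. It assumes that `descriptiveFrame` has a two-level index (station first, quality category second) and an "overall" column with counts. The station names, the numbers and the names `toyDescriptive`, `veryBadCounts` and `worstPlaceSummary` are invented for illustration.

```python
import pandas as pd

# Hypothetical two-level index: (station, quality category).
index = pd.MultiIndex.from_tuples(
    [("StationA", "1 Very good"), ("StationA", "6 Very bad"),
     ("StationB", "1 Very good"), ("StationB", "6 Very bad")],
    names=["station", "quality"],
)
toyDescriptive = pd.DataFrame({"overall": [8000, 10, 7000, 120]}, index=index)

# Cross-section on level 1 keeps only the '6 Very bad' rows, indexed by station.
veryBadCounts = toyDescriptive.xs("6 Very bad", level=1)["overall"]

# idxmax returns the index label (the station) with the largest count.
worstPlace = veryBadCounts.idxmax()

# Cross-section on level 0 shows every quality category for that station.
worstPlaceSummary = toyDescriptive.xs(worstPlace, level=0)

print(worstPlace)
print(worstPlaceSummary)
```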
And here is the table with results:
So, which place gave us such bad results? The place is ... MpKrakAlKras, which is 13, Aleja Zygmunta Krasińskiego, Półwsie Zwierzynieckie, Zwierzyniec, Krakow, Lesser Poland Voivodeship, 31-111, Poland. Here is the map of its neighborhood:

And what about the place with the best air quality? I don't know. Some places do not measure all the important pollutants. I think I could fill in the missing values and then repeat the analysis. But this is material for further blog posts ;). Stay tuned.

Repository for source code used for this analysis: https://github.com/QuantumDamage/AQIP

