So, is Rybnik really the most polluted place in Poland in 2015? Maybe not. To find out, we just need to change the data frame slice from the previous post and run the same procedures.
While running the rest of the previous code, I found unexpected behavior. It turns out that some stations reported pollution levels below zero. Previously I assumed that the data was more or less clean, so I didn't have to look for such errors. But I was wrong. Luckily, it was pretty easy to fix: I just need to put NaN everywhere the pollutant concentration is below zero. Example of the fixed classification:
How do we find the really bad place after those changes? We need to apply the proper selections and we will know immediately:
And here is the table with results:
#reducedDataFrame = bigDataFrame['2015-01-01 00:00:00':'2015-12-31 23:00:00'].loc[(slice(None),pollutedPlaces), :]
reducedDataFrame = bigDataFrame['2015-01-01 00:00:00':'2015-12-31 23:00:00'].loc[(slice(None), slice(None)), :]
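To make the change of slice easier to follow, here is a minimal sketch on toy data. It assumes the big data frame has a sorted two-level index of (timestamp, station); the names `toy`, `StationA`, and `StationB` are made up for illustration and are not from the original data set. It uses `pd.IndexSlice`, which is just another way to write the same kind of selection.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the big data frame: hourly-style timestamps x stations.
# The (time, station) index layout is an assumption about the real data.
times = pd.to_datetime([
    "2014-12-31 23:00:00",
    "2015-01-01 00:00:00",
    "2015-06-15 12:00:00",
    "2015-12-31 23:00:00",
    "2016-01-01 00:00:00",
])
stations = ["StationA", "StationB"]  # hypothetical station codes
index = pd.MultiIndex.from_product([times, stations], names=["time", "station"])
toy = pd.DataFrame({"C6H6": np.arange(len(index), dtype=float)}, index=index)

idx = pd.IndexSlice
start = pd.Timestamp("2015-01-01 00:00:00")
stop = pd.Timestamp("2015-12-31 23:00:00")

# All of 2015, every station (the new selection).
all_stations_2015 = toy.loc[idx[start:stop, :], :]

# All of 2015, only chosen stations (the selection from the previous post).
one_station_2015 = toy.loc[idx[start:stop, ["StationA"]], :]
```

Label slices are inclusive on both ends, so the last hour of 2015 stays in the result.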
import numpy as np

def C6H6qual(value):
    # Negative concentrations are measurement errors, so treat them as missing.
    if value < 0.0:
        return np.nan
    elif 0.0 <= value <= 5.0:
        return "1 Very good"
    elif 5.0 < value <= 10.0:
        return "2 Good"
    elif 10.0 < value <= 15.0:
        return "3 Moderate"
    elif 15.0 < value <= 20.0:
        return "4 Sufficient"
    elif 20.0 < value <= 50.0:
        return "5 Bad"
    elif value > 50.0:
        return "6 Very bad"
    else:
        # NaN fails every comparison above and is passed through unchanged.
        return value
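The same classification can also be done for a whole column at once. This is only a sketch of an alternative, not the code used in the analysis: it masks negative readings with `Series.where` and then bins the rest with `pd.cut`, using the same thresholds as the function above. The name `classify_c6h6` and the sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Same thresholds and labels as in the classification function above.
BINS = [0.0, 5.0, 10.0, 15.0, 20.0, 50.0, np.inf]
LABELS = ["1 Very good", "2 Good", "3 Moderate",
          "4 Sufficient", "5 Bad", "6 Very bad"]

def classify_c6h6(series):
    """Hypothetical vectorized variant: negatives become NaN, the rest get a label."""
    cleaned = series.where(series >= 0.0)  # negative readings -> NaN
    # Intervals are closed on the right; include_lowest keeps 0.0 in the first bin.
    return pd.cut(cleaned, bins=BINS, labels=LABELS, include_lowest=True)

sample = pd.Series([-1.0, 0.0, 5.0, 5.1, 20.0, 50.0, 50.1, np.nan])
labels = classify_c6h6(sample)
```

Missing values stay missing, so stations with gaps do not get a quality label by accident.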
worstPlace = descriptiveFrame.xs('6 Very bad', level=1)["overall"].idxmax()
descriptiveFrame.xs(worstPlace, level=0)
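Here is how those two lines behave on a tiny example. The layout of the descriptive frame is an assumption: a two-level index of (station, quality class) with an `overall` column holding some score per class. The frame `toy_descriptive`, the station names, and the numbers are all invented for illustration.

```python
import pandas as pd

# Assumed layout: (station, quality class) index with an "overall" column.
index = pd.MultiIndex.from_product(
    [["StationA", "StationB"], ["1 Very good", "6 Very bad"]],
    names=["station", "quality"],
)
toy_descriptive = pd.DataFrame({"overall": [100, 2, 80, 7]}, index=index)

# Take only the "6 Very bad" rows; the result is indexed by station.
very_bad = toy_descriptive.xs("6 Very bad", level="quality")["overall"]

# The station with the largest "Very bad" value.
worst_station = very_bad.idxmax()

# All quality classes for that one station.
worst_summary = toy_descriptive.xs(worst_station, level="station")
```

In this toy frame `worst_station` is `StationB`, because its "6 Very bad" value (7) is larger than StationA's (2).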
So, which place gave us such bad results? The place is ... MpKrakAlKras, which is 13, Aleja Zygmunta Krasińskiego, Półwsie Zwierzynieckie, Zwierzyniec, Krakow, Lesser Poland Voivodeship, 31-111, Poland. Here is the map of its neighborhood:
And what about the place with the best air quality? I don't know. Some places do not measure all the important pollutants. I think I could fill in the missing values and then repeat the analysis. But this is material for further blog posts ;). Stay tuned.
Repository for source code used for this analysis: https://github.com/QuantumDamage/AQIP