@@ -1093,3 +1093,89 @@ plt.show()
1093
1093
1094
1094
``` {solution-end}
1095
1095
```
1096
+
1097
+ ``` {exercise}
1098
+ :label: inequality_ex3
1099
+
1100
+ The {ref}`code to compute the Gini coefficient is listed in the lecture above <code:gini-coefficient>`.
1101
+
1102
+ This code uses loops to calculate the coefficient based on income or wealth data.
1103
+
1104
+ This function can be rewritten using vectorization, which greatly improves computational efficiency in Python.
1105
+
1106
+ Re-write the function `gini_coefficient` using `numpy` and vectorized code.
1107
+
1108
+ You can compare the output of this new function with the one above, and note the speed differences.
1109
+ ```
1110
+
1111
+ ``` {solution-start} inequality_ex3
1112
+ :class: dropdown
1113
+ ```
1114
+
1115
+ Let's take a look at some raw data for the US that is stored in `df_income_wealth`.
1116
+
1117
+ ``` {code-cell} ipython3
1118
+ df_income_wealth.describe()
1119
+ ```
1120
+
1121
+ ``` {code-cell} ipython3
1122
+ df_income_wealth.head(n=4)
1123
+ ```
1124
+
1125
+ We will focus on the wealth variable `n_wealth` to compute a Gini coefficient for the year 2016.
1126
+
1127
+ ``` {code-cell} ipython3
1128
+ data = df_income_wealth[df_income_wealth.year == 2016]
1129
+ ```
1130
+
1131
+ ``` {code-cell} ipython3
1132
+ data.head(n=2)
1133
+ ```
1134
+
1135
+ We can first compute the Gini coefficient using the function defined in the lecture above.
1136
+
1137
+ ``` {code-cell} ipython3
1138
+ gini_coefficient(data.n_wealth.values)
1139
+ ```
1140
+
1141
+ Now we can write a vectorized version using `numpy`.
1142
+
1143
+ ``` {code-cell} ipython3
+ def gini(y):
+     n = len(y)
+     # Column and row views of the data, so that subtraction broadcasts
+     y_1 = np.reshape(y, (n, 1))
+     y_2 = np.reshape(y, (1, n))
+     # Sum |y_i - y_j| over all n x n pairs
+     g_sum = np.sum(np.abs(y_1 - y_2))
+     return g_sum / (2 * n * np.sum(y))
1150
+ ```
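
To see why reshaping produces all pairwise differences, the following minimal sketch applies the same steps to a three-element array. The names `y_col`, `y_row`, and `abs_diffs` are illustrative and do not appear in the lecture.

```python
import numpy as np

y = np.array([1.0, 2.0, 3.0])
n = len(y)

y_col = y.reshape((n, 1))   # shape (3, 1)
y_row = y.reshape((1, n))   # shape (1, 3)

# Broadcasting expands both operands to shape (3, 3);
# entry (i, j) holds |y[i] - y[j]|
abs_diffs = np.abs(y_col - y_row)
print(abs_diffs)

# Same normalization as in the vectorized function: 2 * n * sum(y)
g = abs_diffs.sum() / (2 * n * y.sum())
print(g)
```

Here the absolute differences sum to 8 and the normalizing term is 2 × 3 × 6 = 36, so the Gini coefficient is 8/36 ≈ 0.222.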
1151
+ ``` {code-cell} ipython3
1152
+ gini(data.n_wealth.values)
1153
+ ```
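
One way to check the two approaches against each other is sketched below. It is self-contained, so `gini_loop` is only a hypothetical stand-in for the loop-based function from the lecture, and the lognormal sample of 500 observations is an assumption chosen to keep the loop fast.

```python
import time
import numpy as np

def gini_loop(y):
    # Hypothetical stand-in for the lecture's loop-based function
    n = len(y)
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += abs(y[i] - y[j])
    return total / (2 * n * np.sum(y))

def gini_vectorized(y):
    # Pairwise differences via broadcasting, as in the solution
    n = len(y)
    diffs = np.abs(np.reshape(y, (n, 1)) - np.reshape(y, (1, n)))
    return np.sum(diffs) / (2 * n * np.sum(y))

rng = np.random.default_rng(1234)
sample = rng.lognormal(mean=0.0, sigma=1.0, size=500)

start = time.perf_counter()
g_loop = gini_loop(sample)
t_loop = time.perf_counter() - start

start = time.perf_counter()
g_vec = gini_vectorized(sample)
t_vec = time.perf_counter() - start

print(f"loop:       {g_loop:.6f} in {t_loop:.4f} s")
print(f"vectorized: {g_vec:.6f} in {t_vec:.4f} s")
print("results agree:", np.isclose(g_loop, g_vec))
```

The two results should agree up to floating-point error. The timings depend on the machine, so no particular speedup is asserted here.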
1154
+ Let's simulate five populations by drawing from a lognormal distribution, as before:
1155
+
1156
+ ``` {code-cell} ipython3
+ k = 5
+ σ_vals = np.linspace(0.2, 4, k)
+ n = 2_000
+ σ_vals = σ_vals.reshape((k, 1))
+ μ_vals = -σ_vals**2 / 2
+ y_vals = np.exp(μ_vals + σ_vals * np.random.randn(n))
1163
+ ```
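
The simulation relies on broadcasting a `(k, 1)` column of parameters against a length-`n` vector of draws. The sketch below repeats it with a seeded generator (an assumption, added only for reproducibility) to show the resulting shape. Setting μ = -σ²/2 gives each lognormal population a theoretical mean of 1, and the same `n` normal draws are reused across all `k` rows.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, assumed for reproducibility

k, n = 5, 2_000
σ_vals = np.linspace(0.2, 4, k).reshape((k, 1))   # shape (k, 1)
μ_vals = -σ_vals**2 / 2                            # shape (k, 1)

z = rng.standard_normal(n)                         # shape (n,)
y_vals = np.exp(μ_vals + σ_vals * z)               # broadcasts to shape (k, n)

print(y_vals.shape)   # row i holds the sample for the i-th value of σ
```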
1164
+ We can compute the Gini coefficient for each of these five populations using the vectorized function as follows:
1165
+
1166
+ ``` {code-cell} ipython3
+ gini_coefficients = []
+ for i in range(k):
+     # Row i of y_vals holds the i-th simulated population
+     gini_coefficients.append(gini(y_vals[i]))
1170
+ ```
1171
+
1172
+ This gives us the Gini coefficients for these five populations.
1173
+
1174
+ ``` {code-cell} ipython3
1175
+ gini_coefficients
1176
+ ```
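
The pairwise approach builds an `n` by `n` array, so its memory use grows with the square of the sample size. A possible alternative, sketched here and not part of the lecture, sorts the data and uses the identity that the sum of all absolute pairwise differences equals twice the sum of `(2i - n - 1) * y_(i)` over the ascending-sorted values. The names `gini_sorted` and `gini_pairwise` are hypothetical.

```python
import numpy as np

def gini_sorted(y):
    # Sort ascending, then weight each value by (2 * rank - n - 1)
    y = np.sort(np.asarray(y, dtype=float))
    n = len(y)
    ranks = np.arange(1, n + 1)
    return np.sum((2 * ranks - n - 1) * y) / (n * np.sum(y))

def gini_pairwise(y):
    # Reference version using the full n x n matrix of differences
    y = np.asarray(y, dtype=float)
    n = len(y)
    diffs = np.abs(y.reshape((n, 1)) - y.reshape((1, n)))
    return np.sum(diffs) / (2 * n * np.sum(y))

rng = np.random.default_rng(0)
sample = rng.lognormal(size=1_000)

print(np.isclose(gini_sorted(sample), gini_pairwise(sample)))
```

Only one-dimensional arrays of length `n` are created in `gini_sorted`, which may matter when the full dataset is too large for the pairwise matrix to fit in memory.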
1177
+ ``` {solution-end}
1178
+ ```
1179
+
1180
+
1181
+