Bug: Clustering with an all-zero column raises a shape mismatch error
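If a particular column is all zeroes, assigning the result of median_imputer.transform() back to the DataFrame fails with a shape mismatch error. According to the traceback, the transformed array has 50 columns while the frame being assigned to has 51.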
The full traceback is below:
Traceback (most recent call last):
File "/home/karmanya/anaconda3/envs/gramexmaster/lib/python3.6/site-packages/tornado/web.py", line 1543, in _execute
result = yield result
File "/home/karmanya/anaconda3/envs/gramexmaster/lib/python3.6/site-packages/tornado/gen.py", line 1099, in run
value = future.result()
File "/home/karmanya/anaconda3/envs/gramexmaster/lib/python3.6/site-packages/tornado/gen.py", line 1107, in run
yielded = self.gen.throw(*exc_info)
File "/home/karmanya/anaconda3/envs/gramexmaster/lib/python3.6/site-packages/gramex/handlers/filehandler.py", line 191, in _get
yield self._get_path(self.root / path if self.root.is_dir() else self.root)
File "/home/karmanya/anaconda3/envs/gramexmaster/lib/python3.6/site-packages/tornado/gen.py", line 1099, in run
value = future.result()
File "/home/karmanya/anaconda3/envs/gramexmaster/lib/python3.6/site-packages/tornado/gen.py", line 1107, in run
yielded = self.gen.throw(*exc_info)
File "/home/karmanya/anaconda3/envs/gramexmaster/lib/python3.6/site-packages/gramex/handlers/filehandler.py", line 295, in _get_path
item = yield item
File "/home/karmanya/anaconda3/envs/gramexmaster/lib/python3.6/site-packages/tornado/gen.py", line 1099, in run
value = future.result()
File "/home/karmanya/anaconda3/envs/gramexmaster/lib/python3.6/site-packages/tornado/gen.py", line 296, in wrapper
result = func(*args, **kwargs)
File "/home/karmanya/anaconda3/envs/gramexmaster/lib/python3.6/types.py", line 248, in wrapped
coro = func(*args, **kwargs)
File "/home/karmanya/anaconda3/envs/gramexmaster/lib/python3.6/site-packages/gramex/transforms/template.py", line 15, in template
raise tornado.gen.Return(tmpl.generate(handler=handler, **kwargs).decode('utf-8'))
File "/home/karmanya/anaconda3/envs/gramexmaster/lib/python3.6/site-packages/tornado/template.py", line 346, in generate
return execute()
File "<string>.generated.py", line 335, in _tt_execute
info, result, entities = geocluster.get_clusters(handler, datasetname) # <string>:263
File "/home/karmanya/repos/geocluster/geocluster.py", line 352, in get_clusters
scaled[scaled.columns] = median_imputer.transform(scaled)
File "/home/karmanya/anaconda3/envs/gramexmaster/lib/python3.6/site-packages/pandas/core/frame.py", line 2514, in __setitem__
self._setitem_array(key, value)
File "/home/karmanya/anaconda3/envs/gramexmaster/lib/python3.6/site-packages/pandas/core/frame.py", line 2544, in _setitem_array
self.loc._setitem_with_indexer((slice(None), indexer), value)
File "/home/karmanya/anaconda3/envs/gramexmaster/lib/python3.6/site-packages/pandas/core/indexing.py", line 639, in _setitem_with_indexer
value=value)
File "/home/karmanya/anaconda3/envs/gramexmaster/lib/python3.6/site-packages/pandas/core/internals.py", line 3441, in setitem
return self.apply('setitem', **kwargs)
File "/home/karmanya/anaconda3/envs/gramexmaster/lib/python3.6/site-packages/pandas/core/internals.py", line 3329, in apply
applied = getattr(b, f)(**kwargs)
File "/home/karmanya/anaconda3/envs/gramexmaster/lib/python3.6/site-packages/pandas/core/internals.py", line 901, in setitem
values[indexer] = value
ValueError: shape mismatch: value array of shape (228,50) could not be broadcast to indexing result of shape (51,228)
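
A possible explanation, sketched below under assumptions that are not confirmed by the geocluster code: if median_imputer is a scikit-learn imputer and the scaling step is a z-score, the all-zero column becomes all NaN (0/0), and the imputer discards columns that had no observed values at fit time. That would account for transform() returning 50 columns for a 51-column frame. The data, the scaling formula, and the `kept` variable are hypothetical and only illustrate the mechanism and one possible workaround.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical data: column "c" is all zeroes, as in the report.
data = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [4.0, np.nan, 6.0],
    "c": [0.0, 0.0, 0.0],
})

# Assumed scaling step: a z-score turns the constant column into 0/0 = NaN.
scaled = (data - data.mean()) / data.std()

# Assumption: median_imputer behaves like scikit-learn's SimpleImputer.
median_imputer = SimpleImputer(strategy="median").fit(scaled)
imputed = median_imputer.transform(scaled)

# Columns that were entirely NaN at fit time have no median (statistics_ is NaN)
# and are dropped from the output, so `imputed` is narrower than `scaled`.
kept = scaled.columns[~np.isnan(median_imputer.statistics_)]

# Assigning only to the surviving columns avoids the shape mismatch.
scaled[kept] = imputed
```

With this workaround the all-NaN column stays in the frame untouched, so it would still need to be dropped or filled before clustering. Removing zero-variance columns before scaling is another option.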