Feature engineering is one of the most difficult and time-consuming aspects of a data scientist’s job, and the choice of features often determines whether a learning algorithm is effective. FeatureHub makes this step easier, faster, and more effective by crowdsourcing feature creation collaboratively through an open-source web notebook. FeatureHub gathers, tests, collates, and extracts features submitted by users, and uses them to train relevant machine learning models. We are continuously improving FeatureHub’s usability, and we plan to test the models crowdsourced through it against those created through competition-based crowdsourcing.
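
The gather, test, and collate steps described above can be illustrated with a small sketch. This is not FeatureHub's actual API: the dataset, the feature functions, and the helpers `is_valid` and `collate` are all hypothetical names chosen for illustration, and the validity check (a finite number for every row) is an assumed rule, not the project's documented one.

```python
import math
from statistics import mean

# Hypothetical toy dataset: one dict per entity (illustrative only).
rows = [
    {"age": 25, "purchases": [10.0, 20.0]},
    {"age": 40, "purchases": []},
    {"age": 31, "purchases": [5.0]},
]

# Features "submitted by users": plain functions mapping a row to a number.
def total_spend(row):
    return sum(row["purchases"])

def mean_spend(row):
    # Raises on an empty purchase list, so this submission should be rejected.
    return mean(row["purchases"])

def age_decades(row):
    return row["age"] / 10

def is_valid(feature, rows):
    """Assumed test: the feature must return a finite number for every row."""
    try:
        values = [feature(r) for r in rows]
    except Exception:
        return False
    return all(isinstance(v, (int, float)) and math.isfinite(v) for v in values)

def collate(features, rows):
    """Keep the submissions that pass the test and build a feature matrix."""
    accepted = [f for f in features if is_valid(f, rows)]
    names = [f.__name__ for f in accepted]
    matrix = [[f(r) for f in accepted] for r in rows]
    return names, matrix

names, matrix = collate([total_spend, mean_spend, age_decades], rows)
print(names)
print(matrix)
```

In this sketch the resulting matrix has one row per entity and one column per accepted feature, which is the shape a learning algorithm would typically consume in the training step.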

Publications

FeatureHub: Towards Collaborative Data Science (PDF)
Micah Smith, Roy Wedge, Kalyan Veeramachaneni. IEEE International Conference on Data Science and Advanced Analytics, Tokyo, Japan. October 2017.

Feature Factory: A Collaborative, Crowd-Sourced Machine Learning System (PDF)
Alex Wang, MEng Thesis, EECS, MIT, 2015. Advisor: Kalyan Veeramachaneni

Crowd Sourced Feature Discovery (PDF)
Kalyan Veeramachaneni, Kiarash Adl, Una-May O’Reilly. Proceedings of the Second ACM Conference on Learning @ Scale, 2015.

Contributors

Roy Wedge
Micah Smith
Alex Wang

Press

October 30, 2017 — Crowdsourcing big-data analysis — MIT News

November 9, 2017 — Research Project Figures Out How to Crowdsource Predictive Models — Campus Technology