
Feature engineering is one of the most difficult and time-consuming parts of a data scientist’s job, and the choice of features often determines whether a learning algorithm is effective. FeatureHub makes this step easier, faster, and more effective by crowdsourcing feature creation collaboratively through an open-source web notebook. FeatureHub gathers, tests, collates, and extracts the features that users submit, and uses them to train relevant machine learning models. We are continuously improving FeatureHub’s usability, and plan to compare models built from its crowdsourced features against those created through competition-based crowdsourcing.
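The gather, test, collate, and train loop described above can be illustrated with a small sketch. This is not FeatureHub's actual API. The function names (`is_valid`, `build_matrix`, `train_centroids`, `predict`), the toy data, and the nearest-centroid model are all hypothetical and were chosen only to keep the example dependency-free. It assumes each submitted feature is a function that maps the dataset to one number per row.

```python
import math
from typing import Callable, Dict, List, Tuple

Row = Dict[str, float]
Feature = Callable[[List[Row]], List[float]]  # hypothetical submission format


def is_valid(feature: Feature, rows: List[Row]) -> bool:
    """Test a submission: it must run and return one finite number per row."""
    try:
        values = feature(rows)
    except Exception:
        return False
    return (
        isinstance(values, list)
        and len(values) == len(rows)
        and all(isinstance(v, (int, float)) and math.isfinite(v) for v in values)
    )


def build_matrix(
    features: Dict[str, Feature], rows: List[Row]
) -> Tuple[List[str], List[List[float]]]:
    """Gather submissions, keep those that pass the test, collate them column-wise."""
    accepted = {name: f for name, f in features.items() if is_valid(f, rows)}
    columns = [f(rows) for f in accepted.values()]
    matrix = [list(r) for r in zip(*columns)] if columns else [[] for _ in rows]
    return list(accepted), matrix


def train_centroids(matrix: List[List[float]], labels: List[int]) -> Dict[int, List[float]]:
    """Train a toy nearest-centroid model: one mean feature vector per class."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for x, y in zip(matrix, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict(centroids: Dict[int, List[float]], x: List[float]) -> int:
    """Return the class whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )


# Toy dataset and three hypothetical user submissions (one deliberately broken).
rows = [
    {"price": 10.0, "qty": 1.0},
    {"price": 12.0, "qty": 2.0},
    {"price": 50.0, "qty": 9.0},
    {"price": 55.0, "qty": 8.0},
]
labels = [0, 0, 1, 1]
submissions = {
    "revenue": lambda rs: [r["price"] * r["qty"] for r in rs],
    "log_price": lambda rs: [math.log(r["price"]) for r in rs],
    "broken": lambda rs: [r["missing"] for r in rs],  # raises KeyError, so it is rejected
}

names, X = build_matrix(submissions, rows)
model = train_centroids(X, labels)
```

Here `names` contains only the two submissions that passed validation, and `model` can be queried with `predict(model, X[0])`. A real deployment would also need to sandbox untrusted code and evaluate features on held-out data, which this sketch leaves out.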
Publications
FeatureHub: Towards Collaborative Data Science (PDF)
Micah Smith, Roy Wedge, Kalyan Veeramachaneni. IEEE International Conference on Data Science and Advanced Analytics, Tokyo, Japan. October 2017.
Feature Factory: A Collaborative, Crowd-Sourced Machine Learning System (PDF)
Alex Wang, MEng Thesis, EECS, MIT, 2015. Advisor: Kalyan Veeramachaneni
Crowd Sourced Feature Discovery (PDF)
Kalyan Veeramachaneni, Kiarash Adl, Una-May O’Reilly. Proceedings of the Second ACM Conference on Learning @ Scale, 2015.
Contributors
Roy Wedge
Micah Smith
Alex Wang
Press
October 30, 2017 — Crowdsourcing big-data analysis — MIT News
November 3, 2017 — MIT boffins hope to speed up analytics with GitHub-style platform — The Register
November 9, 2017 — Research Project Figures Out How to Crowdsource Predictive Models — Campus Technology