Genetic testing holds much promise for advancing our understanding of cancer and for designing more precise and effective treatments, but progress has been slow because interpreting genomic data still requires a significant amount of manual work. For the past several years, world-class researchers at Memorial Sloan Kettering Cancer Center have worked to create an expert-annotated precision oncology knowledge base. Based on the clinical literature, it contains several thousand annotations indicating which genes are clinically actionable and which are not. This dataset can be used to train machine learning models that help experts significantly speed up their research.
This competition is a challenge to develop classification models that analyze abstracts of medical articles and, based on their content, accurately determine the mutation effect (9 classes) of the genes discussed in them. Participants will have the opportunity to work with real-world data and to address one of the key open questions in cancer genetics and precision medicine. In addition, the winning model will be tested and deployed at Memorial Sloan Kettering, where it will have the potential to touch the more than 120,000 patients seen there every year, and many more around the world.
Organizers - Memorial Sloan Kettering Cancer Center
Memorial Sloan Kettering Cancer Center (MSK) is the leading research medical center in the United States. MSK has devoted more than 130 years to exceptional patient care, innovative research, and outstanding educational programs. Today, we are one of 69 National Cancer Institute–designated Comprehensive Cancer Centers, with state-of-the-art science flourishing side by side with clinical studies and treatment. The close collaboration between our physicians and scientists is one of our unique strengths, enabling us to provide patients with the best care available as we work to discover more effective strategies to prevent, control, and ultimately cure cancer.
June 26, 2017: Competition begins - test set available for participants
October 19, 2017: Competition closes - deadline for submitting models
October 27, 2017: Organizers to announce winning models
November 10, 2017: Participants to report back any possible errors with the evaluation of their models by organizers
December 9, 2017: Winning models and results published
Participants will have access to the training set and the unlabeled test set on the start date of the competition. They will use the training set to train their models and the test set to generate predictions. Once they are comfortable with the performance of a model, they will submit their predicted labels for the test set to the organizers, who will report the F1 score back to the participants. Since the training set and the test set are publicly available for research purposes, we will also ask the winning participants to make their code publicly available.
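Before submitting, participants may want to estimate the F1 score themselves on a held-out part of the training set. The announcement does not state how F1 is averaged across the 9 classes, so the sketch below assumes macro averaging (the unweighted mean of per-class F1 scores) and assigns 0.0 to any class that has no true or predicted samples. The function names `per_class_f1` and `macro_f1` and the toy labels are illustrative, not part of the competition materials.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, labels):
    """Return a dict mapping each class label to its F1 score."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed
    scores = {}
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); assumed to be 0.0 when undefined.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred, labels)
    return sum(scores.values()) / len(labels)


if __name__ == "__main__":
    # Toy example with three of the classes; real data would use labels 1..9.
    y_true = [1, 1, 2, 2, 3]
    y_pred = [1, 2, 2, 2, 3]
    print(per_class_f1(y_true, y_pred, labels=[1, 2, 3]))
    print(round(macro_f1(y_true, y_pred, labels=[1, 2, 3]), 4))
```

If the organizers use a different averaging scheme (for example micro or support-weighted), only `macro_f1` would need to change; the per-class counts stay the same.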