InterCHU Datathon

Produce real-life algorithms

To identify duplicate patients

August - September, 2019 — France


French hospitals (the InterCHU group) are organizing a worldwide datathon that runs over several months. The framework will be presented at the Medinfo congress (August 25th - 30th).


Interoperability and data export outside hospitals are limiting factors in the development of AI in medicine.


The InterCHU group therefore proposes to use a synthetic dataset in a common data model (OMOP), so that participants can develop algorithms without having direct access to real patient data. Once developed, these algorithms are sent to partner centres, where local teams run them. The performance scores are then returned to the participants.
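
As an illustration of what a returned performance score might look like, here is a minimal Python sketch. It assumes that an algorithm outputs pairs of patient identifiers and that the hospital holds a reference list of true duplicate pairs. The metrics (precision, recall, F1), the function names and the sample pairs are assumptions made for this example, not the official scoring method.

```python
def normalise(pairs):
    """Treat (a, b) and (b, a) as the same pair."""
    return {tuple(sorted(pair)) for pair in pairs}


def score_pairs(predicted, reference):
    """Compare predicted duplicate pairs with reference pairs (hypothetical scoring)."""
    pred = normalise(predicted)
    ref = normalise(reference)
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up identifiers: what an algorithm returned vs. what the hospital knows.
predicted = [(1, 2), (5, 3), (7, 8)]
reference = [(2, 1), (3, 5), (9, 10), (11, 12)]

scores = score_pairs(predicted, reference)
print(scores)
```

Only the aggregated scores would need to leave the hospital in such a setup; the pairs themselves stay local.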



OPEN-SOURCE

Use Framagit and Zulip. Learn about the medical common data model (OMOP).

MACHINE LEARNING

Produce algorithms that help hospitals deduplicate patients.

SHARE CODE

Share your code with French hospitals.

FAQ

What is the challenge task?


The challenge is to provide the best open-source algorithm for identifying duplicate patients in a given hospital database. Currently, candidate duplicates are searched for manually, and the process could be improved by algorithms that identify them. Ultimately, the identified duplicate patients could be merged, with great benefit for care, cost and research.
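
To make the task concrete, here is a minimal Python sketch of one possible approach: group records by birth date (a step often called blocking), then compare names within each group with a string similarity score. The records, the field names (`family_name`, `given_name`, `birth_date`) and the 0.85 threshold are hypothetical and chosen only for illustration; a real submission would work with the fields available in the challenge data.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical patient records; field names are illustrative only.
patients = [
    {"id": 1, "family_name": "Martin", "given_name": "Sophie", "birth_date": "1980-04-12"},
    {"id": 2, "family_name": "Martin", "given_name": "Sofie", "birth_date": "1980-04-12"},
    {"id": 3, "family_name": "Durand", "given_name": "Paul", "birth_date": "1980-04-12"},
    {"id": 4, "family_name": "Martin", "given_name": "Sophie", "birth_date": "1975-01-30"},
]


def name_similarity(a, b):
    """Similarity in [0, 1] between the full names of two records."""
    full_a = f"{a['family_name']} {a['given_name']}".lower()
    full_b = f"{b['family_name']} {b['given_name']}".lower()
    return SequenceMatcher(None, full_a, full_b).ratio()


def duplicate_candidates(records, threshold=0.85):
    """Return (id, id, score) for pairs sharing a birth date and a similar name."""
    blocks = defaultdict(list)
    for record in records:
        # Blocking: only records with the same birth date are compared.
        blocks[record["birth_date"]].append(record)

    pairs = []
    for block in blocks.values():
        for a, b in combinations(block, 2):
            score = name_similarity(a, b)
            if score >= threshold:
                pairs.append((a["id"], b["id"], round(score, 2)))
    return pairs


candidates = duplicate_candidates(patients)
print(candidates)
```

Blocking keeps the number of comparisons manageable on large databases, at the price of missing duplicates whose birth dates were entered differently.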

Why do I need to learn open source?


Open source makes it easier to contribute to projects because everybody uses the same set of tools and projects are based on a common platform.

Why do I need to learn a common data model?


Common data models are used to standardize medical data and enhance interoperability. Because every centre stores its data in the same structure, a single version of an algorithm can be written once and run on any OMOP dataset.
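
The following Python sketch illustrates the idea: one query, written once against OMOP column names, runs unchanged on two separate databases. The two in-memory SQLite databases and their rows are made up, and the `person` table is reduced to a small subset of its OMOP columns; the query only finds persons with identical gender and birth date, which is a naive first filter and not a full deduplication method.

```python
import sqlite3

# One query written once against OMOP CDM column names of the person table.
SAME_DEMOGRAPHICS_SQL = """
SELECT a.person_id, b.person_id
FROM person AS a
JOIN person AS b
  ON a.person_id < b.person_id
 AND a.gender_concept_id = b.gender_concept_id
 AND a.year_of_birth = b.year_of_birth
 AND a.month_of_birth = b.month_of_birth
 AND a.day_of_birth = b.day_of_birth
ORDER BY a.person_id, b.person_id
"""


def make_site(rows):
    """Build a made-up in-memory site database with a reduced person table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person ("
        "person_id INTEGER PRIMARY KEY, gender_concept_id INTEGER, "
        "year_of_birth INTEGER, month_of_birth INTEGER, day_of_birth INTEGER)"
    )
    conn.executemany("INSERT INTO person VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def same_demographics(conn):
    """Run the shared query on any connection exposing an OMOP person table."""
    return conn.execute(SAME_DEMOGRAPHICS_SQL).fetchall()


# Two hypothetical sites; 8507 and 8532 are the OMOP concepts for male and female.
site_a = make_site([(1, 8532, 1980, 4, 12), (2, 8532, 1980, 4, 12), (3, 8507, 1980, 4, 12)])
site_b = make_site([(10, 8507, 1962, 11, 3), (11, 8532, 1990, 7, 21), (12, 8507, 1962, 11, 3)])

pairs_a = same_demographics(site_a)
pairs_b = same_demographics(site_b)
print(pairs_a, pairs_b)
```

Because both sites expose the same table and column names, nothing in `same_demographics` has to change from one centre to the next.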

Which programming languages can I use?


Python or R.

How much does it cost to attend?


Nothing. It’s free.

Are there any prizes for winners?


No, it’s only for glory.

Do I need to register?


Yes, on our forum. Introduce yourself in the ‘new members’ channel and create a topic with your team name in the ‘datathon - teams’ channel.

Will I use real data?


No, you will use a synthetic dataset (called SynPUF) to build algorithms on your personal computer. You will then send your algorithm to the hospitals, which will run it on their real datasets once a day from August 1st.

Which hospitals are participating?


Grenoble, Lille, Paris, Rennes and Toulouse. The Osiris group (oncology) and DREES (Direction de la Recherche, des Études, de l’Évaluation et des Statistiques) are also participating.