{"id":31,"date":"2019-12-02T14:00:00","date_gmt":"2019-12-02T13:00:00","guid":{"rendered":"https:\/\/www.codemotion.com\/magazine\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/"},"modified":"2021-12-23T15:15:02","modified_gmt":"2021-12-23T14:15:02","slug":"scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare","status":"publish","type":"post","link":"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/","title":{"rendered":"Scaling is Caring: scalable pipelines for machine learning in healthcare"},"content":{"rendered":"<p><strong>Artificial intelligence<\/strong> is a true game-changer in many fields. But in <strong>healthcare<\/strong>, it promises to actually save and transform lives. Pacmed is a Dutch startup specialising in applying <abbr title=\"Artificial Intelligence\">AI<\/abbr> to healthcare, focusing on intensive care units. In his talk at <strong>Codemotion Amsterdam 2019<\/strong>, Data Scientist <strong>Michele Tonuti<\/strong> explained how they were able to create a scalable pipeline for finding features in complex healthcare data.<\/p>\n<h2>AI in the ICU<\/h2>\n<p>AI (or specifically <strong>machine learning<\/strong>) offers several potential benefits in an intensive care setting:<\/p>\n<ul>\n<li>ML models can be created to help support doctors making discharge decisions (is a patient well enough to be released?)<\/li>\n<li>AI can help doctors determine if someone can be safely extubated (that is, have the breathing tube removed from their throat)<\/li>\n<li>It can be used to allow ward managers to predict and control capacity in the <abbr title=\"Intensive Care Unit\">ICU<\/abbr><\/li>\n<li>Finally, it can help predict when a patient may be at risk of complications by spotting patterns in their observations<\/li>\n<\/ul>\n<p>Key to all these use cases is the ability to extract and identify features in the observation 
data from patients.<\/p>\n<h2>Finding features in complex data<\/h2>\n<p>In an ICU, dozens of different observations are taken from each patient daily. These include physical observations (respirations, pulse rate, blood pressure, SpO2, etc.), but they also include laboratory test results. In addition, there are <strong>numerous standard pieces of data<\/strong> such as age, gender, medical conditions, allergies, etc. In total, there may be 100 or more observations, some taken as frequently as every 15 minutes. On top of that, there is also data relating to medication. To make things harder, every ICU measures data in its own way, and different ICUs can even record different results for the exact same test.<\/p>\n<p>To do something useful with this <strong>huge dataset<\/strong>, Pacmed had to extract features from it, a process known as feature engineering. As an added twist, data protection laws mean that they have to run their feature engineering pipeline on-site, without access to the sort of super-computing cluster that is usually used for such jobs. In short, what they needed to create was a <strong>scalable, repeatable and efficient pipeline that can operate equally well on a cluster or on a laptop<\/strong>. Moreover, the results had to be easy for a doctor to interpret.<\/p>\n<p>The key thing with many of these observations is that the instantaneous reading isn\u2019t necessarily important. What matters is the trend over time. Typically, when presented with such a complex dataset, a data scientist will turn to one of several standard techniques to extract useful features from it. 
These standard techniques include:<\/p>\n<ul>\n<li><strong>Deep learning<\/strong>, such as recurrent neural networks and long short-term memory (LSTM) models.<\/li>\n<li><strong>Fourier transforms<\/strong>, which are a classical way to extract information from signals that vary in time.<\/li>\n<li>Patient2Vec, described as \u201c<strong>A Personalized Interpretable Deep Representation of the Longitudinal Electronic Health Record<\/strong>\u201d<\/li>\n<\/ul>\n<p>Pacmed explored all these techniques and found that none of them achieved what they needed, either because they didn\u2019t scale, the data was too complex or there simply wasn\u2019t enough data (in the case of deep learning). So how can complex, time-variable data be converted into something that doctors can understand?<\/p>\n<h2>Simpler is better<\/h2>\n<p>Having rejected the typical complex data science techniques, Pacmed turned instead to more <strong>traditional statistical features<\/strong>, such as the maximum, minimum, mean, standard deviation and rate of change. Importantly, they found that it was useful to look at these over multiple different time windows: for instance, the last day, the entire stay, or the first day compared with the current one.<\/p>\n<p>In most ICUs, the raw data is available as a massive table of patient ID, time, observation type and observed value. When faced with such data, the first place to turn is the <strong>Python Pandas library<\/strong>. This is specifically designed to handle such tabular data, using a split-apply-combine paradigm to perform calculations:<\/p>\n<p><script src=\"https:\/\/pastebin.com\/embed_js\/qbEn8BcZ\"><\/script><\/p>\n<p>Pandas is especially useful as it includes a <strong>grouper function<\/strong>. 
This allows grouping to be done in time windows:<\/p>\n<p><script src=\"https:\/\/pastebin.com\/embed_js\/m7KaV9LB\"><\/script><\/p>\n<h2>So, isn\u2019t Pandas ideal?<\/h2>\n<p>Pandas has many <strong>plus points<\/strong> that make it seem like the ideal solution for the problem:<\/p>\n<ul>\n<li>Easy to interpret, easy to use and reliable.<\/li>\n<li>Works well with time\/date-time information.<\/li>\n<li>Offers a good selection of aggregations\/statistical functions.<\/li>\n<\/ul>\n<p>However, it has some <strong>negatives<\/strong>, particularly relating to <strong>scalability<\/strong>:<\/p>\n<ul>\n<li>No out-of-the-box parallelisation.<\/li>\n<li>Everything is stored and processed in memory.<\/li>\n<li>Custom aggregations are computationally expensive.<\/li>\n<\/ul>\n<p><script src=\"https:\/\/pastebin.com\/embed_js\/BMD41uYx\"><\/script><\/p>\n<p>Fortunately, the <strong>Dask library<\/strong> makes it easy to <strong>parallelise Pandas<\/strong> (as well as <strong>NumPy<\/strong> and <strong>scikit-learn<\/strong>). It allows you to scale up and work on large datasets that can\u2019t fit in memory. It also lets you use standard Pandas operations (e.g. groupby, join and grouper) in distributed clusters. Equally, it makes it easy to scale down to work on machines with limited resources (e.g. a laptop).<\/p>\n<p><script src=\"https:\/\/pastebin.com\/embed_js\/fUG6sSX5\"><\/script><\/p>\n<h2>So, surely Dask\/Pandas is perfect?<\/h2>\n<p>Unfortunately, no! There are a few significant issues with Dask. Firstly, it doesn\u2019t implement all the aggregations that are available in Pandas (e.g. it can\u2019t apply custom functions on expanding time windows). Secondly, it has many parameters that have to be tuned, such as the number of workers and the partition size, and it is extremely sensitive to them. Changing one parameter slightly can dramatically affect performance. 
Finally, it is actually slower than Pandas on smaller datasets.<\/p>\n<p><center><img decoding=\"async\" class=\"aligncenter size-full wp-image-5485\" style=\"max-width: 500px; height: auto;\" src=\"https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/graph.png\" alt=\"\" \/><\/center>As you can see in the graph, Dask outperforms Pandas once there are more than 5,000 data fields. But what is really significant is the line showing <strong>NumPy<\/strong>. NumPy is a well-known library for data analysis in Python. It uses <strong>vectors<\/strong> for all calculations (rather than data frames). Significantly, it uses native C code to perform the actual calculation. However, it is unable to deal with structured date-time data.<\/p>\n<h2>TSFRESH \u2013 the perfect compromise<\/h2>\n<p>Fortunately, there is a (relatively) new library called <strong>TSFRESH<\/strong> (Time Series Feature Extraction based on Scalable Hypothesis testing). This library uses the same split-apply-combine logic as Pandas, as well as the same data structures, but it uses NumPy for all calculations. It also offers a huge list of aggregates out of the box, many of which are useful for time series data. It can scale down well, using multiprocessing. It can also scale up to cover clusters using Dask. Using this approach, Michele was able to analyse a dataset with 1650 columns and 2240 rows in just 1 minute 26 seconds on his MacBook.<\/p>\n<p>However, TSFRESH is unable to deal with date-time features. As a result, the <strong>Pacmed team has created a custom fork<\/strong>. This uses the Pandas data frame when dealing with time-dependent aggregations, otherwise sticking with NumPy vectors.<\/p>\n<h2>Conclusions<\/h2>\n<p>The important conclusion is that you should <strong>always try to find and adapt an existing solution<\/strong>. This can save you significant time and effort. Also, <strong>don\u2019t be afraid to look at traditional statistical techniques<\/strong>. 
Machine learning is great, but only if you have enough data (a point Michele made in response to a question from the audience). Sadly, despite the wide range of observations collected for every patient, ICUs will never generate the millions of data points needed for deep learning to perform well.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Artificial intelligence is a true game-changer in many fields. But in healthcare, it promises to actually save and transform lives. Pacmed is a Dutch startup specialising in applying AI to healthcare, focusing on intensive care units. In his talk at Codemotion Amsterdam 2019, Data Scientist Michele Tonuti explained how they were able to create a&#8230; <a class=\"more-link\" href=\"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/\">Read more<\/a><\/p>\n","protected":false},"author":7,"featured_media":32,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_editorskit_title_hidden":false,"_editorskit_reading_time":0,"_editorskit_is_block_options_detached":false,"_editorskit_block_options_position":"{}","_uag_custom_page_level_css":"","_genesis_hide_title":false,"_genesis_hide_breadcrumbs":false,"_genesis_hide_singular_image":false,"_genesis_hide_footer_widgets":false,"_genesis_custom_body_class":"","_genesis_custom_post_class":"","_genesis_layout":"","footnotes":""},"categories":[35],"tags":[77,68],"collections":[],"class_list":{"0":"post-31","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-machine-learning","8":"tag-codemotion-amsterdam","9":"tag-python","10":"entry"},"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v26.9 (Yoast SEO v26.9) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>Scaling is Caring: scalable pipelines for machine learning in healthcare - Codemotion 
Magazine<\/title>\n<meta name=\"description\" content=\"Healthcare applications are usually based on very high amounts of data. In this article we will see some solutions to deal with scalability using Python.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Scaling is Caring: scalable pipelines for machine learning in healthcare\" \/>\n<meta property=\"og:description\" content=\"Healthcare applications are usually based on very high amounts of data. In this article we will see some solutions to deal with scalability using Python.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/\" \/>\n<meta property=\"og:site_name\" content=\"Codemotion Magazine\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/Codemotion.Italy\/\" \/>\n<meta property=\"article:published_time\" content=\"2019-12-02T13:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2021-12-23T14:15:02+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1094\" \/>\n\t<meta property=\"og:image:height\" content=\"675\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Toby Moncaster\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@tobym76\" \/>\n<meta name=\"twitter:site\" 
content=\"@CodemotionIT\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Toby Moncaster\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/\"},\"author\":{\"name\":\"Toby Moncaster\",\"@id\":\"https:\/\/www.codemotion.com\/magazine\/#\/schema\/person\/8b9f025e7d76754fb3d4ffd428b0813b\"},\"headline\":\"Scaling is Caring: scalable pipelines for machine learning in healthcare\",\"datePublished\":\"2019-12-02T13:00:00+00:00\",\"dateModified\":\"2021-12-23T14:15:02+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/\"},\"wordCount\":1190,\"publisher\":{\"@id\":\"https:\/\/www.codemotion.com\/magazine\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000.jpg\",\"keywords\":[\"Codemotion Amsterdam\",\"Python\"],\"articleSection\":[\"Machine 
Learning\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/\",\"url\":\"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/\",\"name\":\"Scaling is Caring: scalable pipelines for machine learning in healthcare - Codemotion Magazine\",\"isPartOf\":{\"@id\":\"https:\/\/www.codemotion.com\/magazine\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000.jpg\",\"datePublished\":\"2019-12-02T13:00:00+00:00\",\"dateModified\":\"2021-12-23T14:15:02+00:00\",\"description\":\"Healthcare applications are usually based on very high amounts of data. 
In this article we will see some solutions to deal with scalability using Python.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/#primaryimage\",\"url\":\"https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000.jpg\",\"contentUrl\":\"https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000.jpg\",\"width\":1094,\"height\":675},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.codemotion.com\/magazine\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"AI\/ML\",\"item\":\"https:\/\/www.codemotion.com\/magazine\/ai-ml\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Machine Learning\",\"item\":\"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/\"},{\"@type\":\"ListItem\",\"position\":4,\"name\":\"Scaling is Caring: scalable pipelines for machine learning in healthcare\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.codemotion.com\/magazine\/#website\",\"url\":\"https:\/\/www.codemotion.com\/magazine\/\",\"name\":\"Codemotion Magazine\",\"description\":\"We code the future. 
Together\",\"publisher\":{\"@id\":\"https:\/\/www.codemotion.com\/magazine\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.codemotion.com\/magazine\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.codemotion.com\/magazine\/#organization\",\"name\":\"Codemotion\",\"url\":\"https:\/\/www.codemotion.com\/magazine\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.codemotion.com\/magazine\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/11\/codemotionlogo.png\",\"contentUrl\":\"https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/11\/codemotionlogo.png\",\"width\":225,\"height\":225,\"caption\":\"Codemotion\"},\"image\":{\"@id\":\"https:\/\/www.codemotion.com\/magazine\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/Codemotion.Italy\/\",\"https:\/\/x.com\/CodemotionIT\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.codemotion.com\/magazine\/#\/schema\/person\/8b9f025e7d76754fb3d4ffd428b0813b\",\"name\":\"Toby Moncaster\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.codemotion.com\/magazine\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/126cc1a8360e8cfbfa77aefe9160c4cd916e20f2c3a849d91e1df00c48423ccc?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/126cc1a8360e8cfbfa77aefe9160c4cd916e20f2c3a849d91e1df00c48423ccc?s=96&d=mm&r=g\",\"caption\":\"Toby Moncaster\"},\"description\":\"I am an experienced freelance writer. I specialise in making complex topics accessible to wider audiences. My interests include TCP\/IP, data protection and AI. 
I currently work with B2B startups across the world. I hold 5 patents, edited 3 RFCs and received a PhD in computer science from the University of Cambridge.\",\"sameAs\":[\"https:\/\/www.linkedin.com\/in\/tobymoncaster\/\",\"https:\/\/x.com\/tobym76\"],\"url\":\"https:\/\/www.codemotion.com\/magazine\/author\/toby-moncaster\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Scaling is Caring: scalable pipelines for machine learning in healthcare - Codemotion Magazine","description":"Healthcare applications are usually based on very high amounts of data. In this article we will see some solutions to deal with scalability using Python.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/","og_locale":"en_US","og_type":"article","og_title":"Scaling is Caring: scalable pipelines for machine learning in healthcare","og_description":"Healthcare applications are usually based on very high amounts of data. 
In this article we will see some solutions to deal with scalability using Python.","og_url":"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/","og_site_name":"Codemotion Magazine","article_publisher":"https:\/\/www.facebook.com\/Codemotion.Italy\/","article_published_time":"2019-12-02T13:00:00+00:00","article_modified_time":"2021-12-23T14:15:02+00:00","og_image":[{"width":1094,"height":675,"url":"https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000.jpg","type":"image\/jpeg"}],"author":"Toby Moncaster","twitter_card":"summary_large_image","twitter_creator":"@tobym76","twitter_site":"@CodemotionIT","twitter_misc":{"Written by":"Toby Moncaster","Est. reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/#article","isPartOf":{"@id":"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/"},"author":{"name":"Toby Moncaster","@id":"https:\/\/www.codemotion.com\/magazine\/#\/schema\/person\/8b9f025e7d76754fb3d4ffd428b0813b"},"headline":"Scaling is Caring: scalable pipelines for machine learning in 
healthcare","datePublished":"2019-12-02T13:00:00+00:00","dateModified":"2021-12-23T14:15:02+00:00","mainEntityOfPage":{"@id":"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/"},"wordCount":1190,"publisher":{"@id":"https:\/\/www.codemotion.com\/magazine\/#organization"},"image":{"@id":"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/#primaryimage"},"thumbnailUrl":"https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000.jpg","keywords":["Codemotion Amsterdam","Python"],"articleSection":["Machine Learning"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/","url":"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/","name":"Scaling is Caring: scalable pipelines for machine learning in healthcare - Codemotion Magazine","isPartOf":{"@id":"https:\/\/www.codemotion.com\/magazine\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/#primaryimage"},"image":{"@id":"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/#primaryimage"},"thumbnailUrl":"https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000.jpg","datePublished":"2019-12-02T13:00:00+00:00","dateModified":"2021-12-23T14:15:02+00:00","description":"Healthcare applications are usually based on very high amounts of data. 
In this article we will see some solutions to deal with scalability using Python.","breadcrumb":{"@id":"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/#primaryimage","url":"https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000.jpg","contentUrl":"https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000.jpg","width":1094,"height":675},{"@type":"BreadcrumbList","@id":"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/scaling-is-caring-scalable-pipelines-for-machine-learning-in-healthcare\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.codemotion.com\/magazine\/"},{"@type":"ListItem","position":2,"name":"AI\/ML","item":"https:\/\/www.codemotion.com\/magazine\/ai-ml\/"},{"@type":"ListItem","position":3,"name":"Machine Learning","item":"https:\/\/www.codemotion.com\/magazine\/ai-ml\/machine-learning\/"},{"@type":"ListItem","position":4,"name":"Scaling is Caring: scalable pipelines for machine learning in healthcare"}]},{"@type":"WebSite","@id":"https:\/\/www.codemotion.com\/magazine\/#website","url":"https:\/\/www.codemotion.com\/magazine\/","name":"Codemotion Magazine","description":"We code the future. 
Together","publisher":{"@id":"https:\/\/www.codemotion.com\/magazine\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.codemotion.com\/magazine\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.codemotion.com\/magazine\/#organization","name":"Codemotion","url":"https:\/\/www.codemotion.com\/magazine\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.codemotion.com\/magazine\/#\/schema\/logo\/image\/","url":"https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/11\/codemotionlogo.png","contentUrl":"https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/11\/codemotionlogo.png","width":225,"height":225,"caption":"Codemotion"},"image":{"@id":"https:\/\/www.codemotion.com\/magazine\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/Codemotion.Italy\/","https:\/\/x.com\/CodemotionIT"]},{"@type":"Person","@id":"https:\/\/www.codemotion.com\/magazine\/#\/schema\/person\/8b9f025e7d76754fb3d4ffd428b0813b","name":"Toby Moncaster","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.codemotion.com\/magazine\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/126cc1a8360e8cfbfa77aefe9160c4cd916e20f2c3a849d91e1df00c48423ccc?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/126cc1a8360e8cfbfa77aefe9160c4cd916e20f2c3a849d91e1df00c48423ccc?s=96&d=mm&r=g","caption":"Toby Moncaster"},"description":"I am an experienced freelance writer. I specialise in making complex topics accessible to wider audiences. My interests include TCP\/IP, data protection and AI. I currently work with B2B startups across the world. 
I hold 5 patents, edited 3 RFCs and received a PhD in computer science from the University of Cambridge.","sameAs":["https:\/\/www.linkedin.com\/in\/tobymoncaster\/","https:\/\/x.com\/tobym76"],"url":"https:\/\/www.codemotion.com\/magazine\/author\/toby-moncaster\/"}]}},"featured_image_src":"https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000-600x400.jpg","featured_image_src_square":"https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000-600x600.jpg","author_info":{"display_name":"Toby Moncaster","author_link":"https:\/\/www.codemotion.com\/magazine\/author\/toby-moncaster\/"},"uagb_featured_image_src":{"full":["https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000.jpg",1094,675,false],"thumbnail":["https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000-150x150.jpg",150,150,true],"medium":["https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000-300x185.jpg",300,185,true],"medium_large":["https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000-768x474.jpg",768,474,true],"large":["https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000-1024x632.jpg",1024,632,true],"1536x1536":["https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000.jpg",1094,675,false],"2048x2048":["https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000.jpg",1094,675,false],"small-home-featured":["https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000.jpg",100,62,false],"sidebar-featured":["https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMa
rketing-1621x1000-180x128.jpg",180,128,true],"genesis-singular-images":["https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000-896x504.jpg",896,504,true],"archive-featured":["https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000-400x225.jpg",400,225,true],"gb-block-post-grid-landscape":["https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000-600x400.jpg",600,400,true],"gb-block-post-grid-square":["https:\/\/www.codemotion.com\/magazine\/wp-content\/uploads\/2019\/08\/MachineLearninginMarketing-1621x1000-600x600.jpg",600,600,true]},"uagb_author_info":{"display_name":"Toby Moncaster","author_link":"https:\/\/www.codemotion.com\/magazine\/author\/toby-moncaster\/"},"uagb_comment_info":0,"uagb_excerpt":"Artificial intelligence is a true game-changer in many fields. But in healthcare, it promises to actually save and transform lives. Pacmed is a Dutch startup specialising in applying AI to healthcare, focusing on intensive care units. 
In his talk at Codemotion Amsterdam 2019, Data Scientist Michele Tonuti explained how they were able to create a&#8230;&hellip;","lang":"en","_links":{"self":[{"href":"https:\/\/www.codemotion.com\/magazine\/wp-json\/wp\/v2\/posts\/31","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.codemotion.com\/magazine\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.codemotion.com\/magazine\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.codemotion.com\/magazine\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/www.codemotion.com\/magazine\/wp-json\/wp\/v2\/comments?post=31"}],"version-history":[{"count":3,"href":"https:\/\/www.codemotion.com\/magazine\/wp-json\/wp\/v2\/posts\/31\/revisions"}],"predecessor-version":[{"id":1882,"href":"https:\/\/www.codemotion.com\/magazine\/wp-json\/wp\/v2\/posts\/31\/revisions\/1882"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.codemotion.com\/magazine\/wp-json\/wp\/v2\/media\/32"}],"wp:attachment":[{"href":"https:\/\/www.codemotion.com\/magazine\/wp-json\/wp\/v2\/media?parent=31"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.codemotion.com\/magazine\/wp-json\/wp\/v2\/categories?post=31"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.codemotion.com\/magazine\/wp-json\/wp\/v2\/tags?post=31"},{"taxonomy":"collections","embeddable":true,"href":"https:\/\/www.codemotion.com\/magazine\/wp-json\/wp\/v2\/collections?post=31"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}