Two years ago, when the epidemiologist John Brownstein of the Children’s Hospital at Harvard Medical School in Boston and the software developer Clark Freifeld came up with the idea of scouring the fathomless spaces of the global internet for clues to potential disease outbreaks, the SARS crisis had already been over for several years. But SARS was, and still is, the prototype of a modern plague: it breaks out in a hard-to-reach region somewhere in the world. It is hard to spot because a repressive regime rules the country. And it spreads rapidly around the world thanks to air traffic.
Health Map wants to google plagues
Could the Western world have been informed about SARS earlier? Brownstein is sure: “We found the earliest reports on SARS in internet chatrooms where people discussed the problems in the Guangdong province,” he reports. “At that time it was the only way for the rest of the world to find out about it.” But nobody really paid any attention to it at the time. Today a similar constellation would be noticed. The World Health Organization (WHO) filters local news in a more targeted way than before. And with Health Map, Brownstein and Freifeld have developed a tool that tries to automatically filter exactly those early hints of outbreaks from the vast flood of information on the internet.
Health Map is mainly interested in local news making the rounds in the regional press, on the portals of small communities, or in chatrooms and blogs. Using complex text-mining algorithms, the search engine combs through all the sources it can access, trying to identify news that might point to a potential outbreak of a disease. The approach works: having started as a pure university project, Health Map now receives funding from the National Library of Medicine, the National Institutes of Health, the Canadian Institutes of Health Research and, last but not least, Google, which is providing no less than 450,000 US dollars.
“Text mining” sounds simple but is actually extremely complex, because the internet is full of health-related news, and only very little of it is relevant to a potential outbreak. Freifeld emphasizes: “There are thousands of health-related articles on the internet, in scientific journals, about health policy topics. We have to separate all those from real news about potential plagues for our alerts.” Health Map makes this work by classifying the collected text snippets by region, disease and pathogen. It compares news originating from similar regions at similar times and filters out duplicates, arriving step by step at the few news items that count in the end.
Most of them are then read “by hand” as an additional quality-assurance step. Only then do they find their way into the end product that Health Map makes available to internet users: an interactive map of the world based on Google technology. Locations of potential outbreaks are marked with small flags. The color of a flag shows how relevant the outbreak could be. “Think of it as some kind of spam filter,” says Freifeld. Just like a spam filter, the text-mining algorithm in Health Map is continually updated and adjusted to remain as effective as possible. Just how hard this is, the two founders of Health Map describe in detail in their latest article in the journal PLoS Medicine.
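The filtering idea described above (classify snippets by region and disease, then drop near-duplicates from the same region at about the same time) can be illustrated with a very small sketch. This is not Health Map's actual algorithm, which the article does not spell out. The keyword tables, the `Alert` record, the function names and the three-day window are all assumptions made for illustration, and the pathogen dimension is left out for brevity.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional

# Hypothetical keyword tables; a real system would need far richer
# dictionaries and would have to handle several languages.
DISEASES = {"sars": "SARS", "cholera": "Cholera", "avian influenza": "Avian influenza"}
REGIONS = {"guangdong": "Guangdong", "hanoi": "Hanoi", "toronto": "Toronto"}


@dataclass(frozen=True)
class Alert:
    region: str
    disease: str
    day: date
    text: str


def classify(text: str, day: date) -> Optional[Alert]:
    """Tag a snippet with a region and a disease, or return None if either is missing."""
    lowered = text.lower()
    disease = next((name for key, name in DISEASES.items() if key in lowered), None)
    region = next((name for key, name in REGIONS.items() if key in lowered), None)
    if disease is None or region is None:
        return None  # general health news without a clear outbreak signal
    return Alert(region, disease, day, text)


def deduplicate(alerts: Iterable[Alert], window_days: int = 3) -> List[Alert]:
    """Keep an alert only if no report on the same region and disease
    appeared within the preceding window_days."""
    kept: List[Alert] = []
    last_seen = {}
    for alert in sorted(alerts, key=lambda a: a.day):
        key = (alert.region, alert.disease)
        previous = last_seen.get(key)
        if previous is None or (alert.day - previous).days > window_days:
            kept.append(alert)
        # Record every report, so a continuous stream of follow-ups stays collapsed.
        last_seen[key] = alert.day
    return kept
```

In this sketch, a health-policy article that names no known disease or region is discarded by `classify`, and two reports about SARS in Guangdong published a day apart collapse into a single alert in `deduplicate`. Simple substring matching is used only to keep the example short; it would produce many false matches on real news text.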
Many of the visitors are plague professionals
But Health Map is not just another interactive map: it also serves as an information portal. Just a few clicks open up diverse sources of information such as Wikipedia, PubMed and Google Trends. And Health Map does not only speak English: as is well known, plagues often break out in developing countries or tropical regions. Accordingly, the algorithms currently also cover Chinese, Russian, French and Spanish news sources. Hindi, Portuguese and Arabic are in preparation. Health Map has about 20,000 visitors (“unique visitors”) per month. A sure sign that the site is being taken seriously is the above-average number of requests from the WHO, the Centers for Disease Control and the European Centre for Disease Prevention and Control.