Database Project To Track Health Of 100,000 U.S. Children

The NIH-backed study will examine the effects of genes and environmental factors on the health of volunteer participants in 105 locations from before birth to age 21.

Marianne Kolbasuk McGee, Senior Writer, information

October 7, 2008

An ambitious new government study to follow the health of 100,000 U.S. children from before birth to age 21 is launching in January, and information technology for data collection and analysis is playing a central role.

The National Institutes of Health National Children's Study, which was authorized by Congress in the Children's Health Act of 2000, will examine the effects of genes and environmental factors on the health of American children based on volunteer participants in 105 locations, representing a composite of the U.S. population.

The study involves collecting genetic, biological, and environmental samples from children across all racial groups, income and educational levels, and geographies, including rural, suburban, and urban areas. The goal of the project is to improve the health and well-being of American children.

The study kicks off in January with a pilot project enrolling 1,250 infants from Queens, N.Y., and Duplin County, N.C. Then, during the next five years, approximately 1,000 babies from each of 105 locations in the United States will be enrolled in the study, a process that includes recruiting pregnant and not-yet-pregnant women. The study will investigate factors influencing the development of a range of conditions, such as autism, cerebral palsy, learning disabilities, birth defects, diabetes, asthma, cancer, violent behavior, and obesity.

Central to the study is the collection of data about the participants, from before conception through the babies' deliveries and all the way up to young adulthood, said Sarah Keim, deputy director for operations and logistics for the National Children's Study, in an interview with information. Findings of the study will be released for publication in medical and clinical journals along the way, said Keim.

"We know that not all [children] will stick with the study, so we're recruiting more women than initially needed. The goal is to follow these children up to age 21," she said.

While the study received $110.9 million in federal funding in fiscal 2008, and $69 million in fiscal 2007, it's still uncertain how much funding the project will get next year, Keim said. Approximately 10% to 11% of the study's fiscal 2008 budget was allocated toward developing an information management system for the project, "which includes a combination of commercial off-the-shelf, government off-the-shelf, and custom code for best-of-breed solutions," said Keim.

"A major challenge is to provide an IT architecture that supports 105 study locations with additional field sites at each location that is FISMA-compliant and still reasonably easy to use," she said. FISMA, the Federal Information Security Management Act, is a federal law that sets information security requirements and guidelines for government agencies; it was enacted as part of the E-Government Act of 2002.

"We use a .Net and a VPN communication with defense in depth to meet FISMA," she added. "The applications use best-of-breed modules from Westat, our coordinating center, and commercial off the shelf from a variety of vendors that are integrated by Booz Allen Hamilton, our IMS integrator," Keim said.

An inventory management system based on commercial off-the-shelf software from eCity will be used to track supplies and equipment used in the study, she said. The bulk of the study data will be collected during "study visits" with the children that will be more frequent with infants and babies and likely to become less frequent as children grow older, Keim said. Those visits will take place at about 105 locations across the United States, including participating university medical centers, as well as smaller study offices set up in more convenient locations for families.

"One of those visit centers is next to the only post office in town," Keim said. On other occasions, data from air quality and other environmental samples will be collected during visits to the children's homes. Keim estimates that there are close to 1,000 researchers and federal staff working on the study in teams across about 40 research centers, including medical schools, hospitals, public health departments, and nonprofit organizations.

The study visits "are not substitutes for usual medical care" children need from their pediatrician or other health care providers, Keim said.

As more doctors nationwide deploy electronic medical record systems in their offices, data from patient records will be used in the study as well. In the meantime, doctors, parents, and patients can update researchers on health issues during study visits, and information will also be collected in other ways. For instance, pregnant women will be asked to track care and health in a log, Keim said.

Researchers will collect data using tablet PCs running Windows XP. Keim said it's expected that 5,000 tablet PCs will be used for the full study, or about 50 tablets per study center. Study staff will securely transmit tablet data to the central coordinating center via VPN.

Among the software being used for the study are applications developed by the Centers for Disease Control and Prevention and other federal agencies. That includes CDC survey-authoring software that allows researchers to quickly and flexibly create questionnaires. "That sounds simple, but it's actually very hard to do," Keim said.

Data will be stored in Oracle and SQL databases. The size of the study's databases will be "relatively small in the beginning but will ramp up over the study," she said. "We will add storage as needed. It is tiered based on access time. We do not have the final estimate, but the size of the central database will certainly be in the multiple-terabyte range," she said.

"Much of the data analysis will be via commercial applications like SAS or SPSS," said Keim. "There will also be software tools to analyze genetic data, likely to be adapted from previous genetics studies and originally developed by consortia like through the Human Genome Project. Also, mapping tools will be important for data analysis, like ArcGIS," she said. ArcGIS is a product from GIS software vendor ESRI.

The team of researchers involved with the study includes obstetricians, pediatricians, social scientists, neurologists, and psychologists. Preliminary results from the study's early years are expected to start appearing in 2011. The study will tackle subjects ranging from how a child's genes and environment interact to promote violent behavior in teenagers, to whether exposure to allergens early in life can actually help prevent asthma from developing in children.

About the Author

Marianne Kolbasuk McGee

Senior Writer, information

Marianne Kolbasuk McGee is a former editor for information.
