This study integrates the principles of pattern-based classification and Kaplan-Meier survival analysis to identify genes and clinical features associated with the rapid progression of chronic kidney disease. The methodology successfully determines the gene-gene survival interactions in the African-American Study of Chronic Kidney Disease with Hypertension (AASK) genomic dataset. The results obtained from this study serve as a basis for future studies comparing disease progression in white patients with that in African-American patients, both those with and those without apolipoprotein L1 (APOL1) high-risk variants.
Pattern-based Classification and Survival Analysis of Chronic Kidney Disease
Munevver M. Subasi, Melissa S. Moreno, Travaughn Bain, Megan Moreno, Ersoy Subasi, Katherine C. Carroll*, Emily R. Cunningham* and Michael Lipkowitz. ISAIM 2016 - International Symposium on Artificial Intelligence and Mathematics, Fort Lauderdale, FL. January 4–6, 2016. *Summer 2015 BioMath REU Interns
Island populations have been called “nature’s test tubes” (Losos and Ricklefs, 2009) because they are particularly amenable to examining patterns of evolution. Since Darwin's seminal book (1845), researchers have noted traits that are shared among many island taxa, including gigantism, loss of flight, and tameness, that are thought to be the result of strong selective pressures (Grant, 1998). Only recently, with the advent of next-generation sequencing approaches and the subsequent development of bioinformatics tools, have researchers begun assessing the genetic basis of adaptation (Deagle et al., 2013). We will examine the genomics of song sparrow (Melospiza melodia) populations using 1,000 SNPs identified as possible candidate markers for island adaptation in this species (Srivastava et al., 2012). Song sparrows found in the Aleutian Islands of Alaska exhibit several traits associated with island adaptation that are not found in mainland populations, including gigantism (Fig. 1) and loss of migratory behavior (Pruett and Winker 2005). We will compare island and mainland locations and two closely related species (Melospiza georgiana and Melospiza lincolnii) to identify loci that are likely to be under selection in island populations. Identified loci will then be compared with the zebra finch (Taeniopygia guttata) and the forthcoming song sparrow genomes to map the locations and function of these regions. The genetic data for this project are in hand.
Proposed Research Activity: The biggest challenge encountered in the analysis of SNP data is deciding which and how many SNPs (or genes, in microarray data) should be selected for further study. In this project students will develop systematic procedures that take advantage of computer-related developments and advanced combinatorial optimization techniques for optimizing feature selection in the song sparrow SNP data and for identifying the set of combinatorial patterns of SNPs that distinguish island sparrows from mainland populations. Students will be exposed to traditional statistical methods and more sophisticated techniques including support vector machines (Burges, 1998; Schölkopf, 2002), decision trees (Quinlan, 1993), neural networks (McCulloch, 1943; Minsky, 1969; Rumelhart et al., 1986), logical analysis of data (Hammer et al., 1986, 2006; Alexe et al., 2003-2006; Reddy et al., 2008; Subasi et al., 2012, 2013), and emerging technologies such as web services and grid computing. Ultimately, loci identified as being functionally important in island adaptation will be examined in other species of birds in the Aleutian Islands that also exhibit gigantism (Gibson and Byrd), such as rock ptarmigan (Lagopus muta), common raven (Corvus corax), and Pacific wren (Troglodytes pacificus).
• Dr. Robert van Woesik (Biological Sciences)
Coral reefs support the most diverse marine assemblages on Earth. Not only are coral reefs spectacularly beautiful systems, they also provide food resources and environmental services. Coral reefs provide fish for millions of people living adjacent to reefs, and they act as barriers that protect island nations from large storm waves. The recent increase in atmospheric CO2 from the burning of fossil fuels, however, has raised the temperatures of both the atmosphere and ocean surface waters. In combination these temperature increases are raising global sea level. Island nations are particularly vulnerable to such climate change, especially islanders who live less than 1 m above modern sea level. Over the next century, the climate is predicted to drive water temperatures even higher than today, and considerably above the temperatures experienced by reef corals over the last 700,000 years. Whether coral reefs can ‘keep up’ with continued sea-level rise is largely dependent on i) future rates of sea-level rise, and ii) future responses of reef growth in a warmer ocean. Therefore the key research question here is: where will coral reefs continue to thrive and be able to keep up with sea-level rise in a warming ocean? This project will examine and develop new models of coral-reef growth.
Coral reefs have in the past been able to keep up with sea-level rise, but a reef that is depauperate of corals will not be able to do so. We will model whether reefs can ‘keep up’ with predicted sea-level rise in the near future by comparing estimated vertical extension rates with rates of sea-level rise under four representative concentration pathways. Net coral-reef growth can be considered as a simple system of reef production minus erosion. Through geological time, reef growth has only occurred when the local production of calcium carbonate exceeds the local destructive processes. Production depends on the densities of organisms that are able to accrete calcium carbonate, whereas destructive processes are a function of the rates of biological, physical, and chemical erosion. We will try to capture the main processes involved in coral-reef growth, model the system, and predict rates of change. These growth rates will be compared with global climate models that predict changes in ocean temperature and sea-level rise.
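The "growth minus erosion" idea can be illustrated with a short Python sketch. This is only a minimal illustration, not the project's model: the function names, the constant rates, and the four sea-level-rise values are hypothetical placeholders chosen for the example, not projections from any representative concentration pathway.

```python
def net_accretion(production, erosion):
    """Net vertical reef growth (mm/yr): carbonate production minus erosion."""
    return production - erosion


def keeps_up(production, erosion, sea_level_rise):
    """True if net accretion matches or exceeds the rate of sea-level rise."""
    return net_accretion(production, erosion) >= sea_level_rise


def depth_after(initial_depth_m, production, erosion, sea_level_rise, years):
    """Water depth over the reef top (m) after `years` at constant rates.

    A reef cannot grow above sea level, so the depth is never negative.
    """
    deficit = sea_level_rise - net_accretion(production, erosion)  # mm/yr
    return max(initial_depth_m + deficit * years / 1000.0, 0.0)


# Hypothetical rates in mm/yr, for illustration only.
production, erosion = 6.0, 2.0
scenarios = {"scenario A": 3.0, "scenario B": 4.0,
             "scenario C": 6.0, "scenario D": 8.0}

for name, rise in scenarios.items():
    status = "keeps up" if keeps_up(production, erosion, rise) else "falls behind"
    depth = depth_after(1.0, production, erosion, rise, years=100)
    print(f"{name}: {status}, depth after 100 yr = {depth:.2f} m")
```

In a fuller model the production and erosion terms would themselves depend on coral density, temperature, and time rather than being constants.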
Students will have the opportunity to develop their methods in a modern computing environment at Florida Institute of Technology. First, students will learn standard statistical analysis techniques using R and will develop new methods to investigate the relationship between modern and past coral-reef processes. The students will then investigate potential reef-growth models and climate change predictions. The problems will be addressed through team discussion, experimental design, and study of the existing methods. Students will also participate in discussions of how the findings can be used toward the development of a new generation of applications. They will also practice how to share their ideas, think analytically, and go through the decision-making process as a team. This project will provide them with a balanced combination of theoretical study, practical algorithm design, and teamwork.
Sea level changes
Changes in sea level are a direct consequence of ice ages; during an ice age large water masses are stored as ice at the poles. There were no major ice ages before the Pleistocene epoch (1.6 my BP) because the world was probably too warm to be easily pushed into an ice age. The earth cooled through the Cenozoic era (65 my BP) and at the start of the Pleistocene was ready to freeze. The Pleistocene was colder than today, and every now and then was nudged into glaciation by fluctuations in the path and position of the earth. Changes in the Earth's position relative to the sun have profound effects on the Earth's climate, and may indeed provide the final push toward ice ages.
The earth's relationship with the sun changes in three ways: (1) in its orbit (eccentricity, or the path it follows around the Sun in one year), where the elliptical circuit varies in shape and is sometimes almost a circle (the cycle from circle to ellipse is 95,800 years); (2) in its tilt (inclination or obliquity, the angle of its axis of spin), which varies from 21.39° to 24.36° (cycle 41,000 years); and (3) in its precession (or wobble), with a cycle of approximately 21,700 years. Together these three effects are called Milankovitch cycles (after Milutin Milankovitch in the early twentieth century). The constant flipping from ice age (glacial) to interglacial throughout the Pleistocene is related to the Milankovitch cycles. These cycles last roughly 100,000 years.
Moreno glacier, Patagonia, South America (photo by Col Whittingham)
Ice ages and consequences
Ice ages cause drops in what are referred to as glacio-eustatic sea levels (where eustatic is a term used to define a global effect). Sea level can drop by as much as 130 m during the glacial maximum. McCulloch et al. (1999) established that the sea surface temperature of tropical Indo-Pacific water, more specifically around Papua New Guinea, in the penultimate (i.e., one before the final) deglaciation period (130,000 years ago), when sea level was 60 m below today's level, was approximately 7 degrees Celsius lower on average than it is today (at 29°C). Two competing models describe how sea level rose as the ice caps melted. The first model argues for a gradual rise (Bloom, Chappell and Shackleton 1986), and the second model suggests a two-step process (Fairbanks 1989). Fairbanks argues that deglaciation occurred with maximum melting rates from 14,000 to 12,000 years BP and from 10,000 to 7,000 years BP, separated by a mid-deglacial pause with no ice loss (Fairbanks 1989). It appears that Fairbanks was right.
The rise in sea level is generally referred to as the transgression (a fall in sea level is classified as the regression), and it occurs only during an interglacial period such as the one we have today. The transgression allowed seas to again inundate the continental shelves from their low stand. The geological period in which the transgression occurred is called the Holocene (or recent) and began some 10,000 years ago. Sea level stabilised at its modern level at around 6 000 years BP, but slowly rose by 1.5 m around 2 500 years BP and quickly subsided again to modern levels.
Geology and Geomorphology
This essay does not cover types of reefs; for that information see Hopley (1982). With a rising sea level, coral reefs started growing during the onset of the transgression over antecedent foundations (old reef structures), or karsts. The reefs we see today were built in the Holocene, but many are merely a veneer covering old reefs that were deposited in earlier times (for example during the Pleistocene). Most coral reefs growing today are growing on substrates that stood above sea level many times throughout the Pleistocene.
Early reef growth through the transgression produced "juvenile" reefs, growing toward sea level. They evolved through vertical growth to form "mature" reefs, with reef flats at modern sea level. Further infilling of reefs with lagoons occasionally terminated their growth; these reefs are typically referred to as "senile" reefs. The two most important parameters determining reef morphology are (1) the morphology and depth of the pre-Holocene reefal surface and (2) the net rate of reef accretion (Hopley 1983). Prior to Hopley's paper in 1983, reef morphology was thought to be largely a consequence of the molding action of wind, waves and currents. This may indeed be the case once the reef has reached sea level; however, other factors are dominant before this stage.
We will briefly examine the Great Barrier Reef (GBR) as an example of the general variation in reef development. In the GBR region, outer reefs, which are exposed to high wave energy regimes, tend to be framework reefs supporting mainly framework corals (branching and massive), whereas the inner reefs tend to be more detrital (composed of rubble, sand and mud). Outer reefs also have much higher levels of cementation (by coralline algae), whereas the inner reefs have virtually none.
Inshore fringing reefs are common on many continental islands close to the mainland shore. The best developed of these reefs typically have both framework and detrital elements and distinctive reef flats and reef slopes. The least developed are termed 'incipient' reefs: essentially detrital banks without reef flats that are colonised by hard corals, and usually by other sessile benthos such as macro-algae, soft corals and zoanthids. In addition, there are coral communities without appreciable framework or detrital accumulations on the rocky flanks and headlands of the islands.
For many Great Barrier Reef reefs, reef growth commenced between 8,320 and 7,500 y BP (Hopley 1982). On many inshore reefs the entire accretion of reef structure has taken place during the last 6,000-5,000 years. A time lag between reef initiation and the transgression may have been due to terrigenous influences and high turbidity on a shallow shelf. Chappell et al. (1983) proposed two models for the development of fringing reefs with reef flats: one for those which had reached their current widths soon after sea level peaked 6,000-5,000 y BP, and one for those whose seaward edge has gradually advanced across the sea floor since that time. In both models, coral growth on the reef's outer margin is a primary source of the skeletal elements comprising the reef matrix. The average growth rate for reefs has been reported at 4-5 mm per year, although incipient reefs grow at a rate of < 1 mm per year.
In the Ryukyu Islands of Japan, Holocene reefs began forming on old carbonate foundations some 8,500 to 8,000 y BP. Research has been undertaken primarily on Kikai Island, Kume Island, Yoron Island, Okierabu Island, and Minna Island. Kikai Island and Kume Island are uplifted intermittently; therefore the story there is more complicated. A typical reef at Yoron Island appears to have grown at 1-3 mm a year and reached modern sea level about 5,000 years ago. Holocene deposits range from 3.0 to 15.0 m. Okierabu and Minna are quite contrasting reefs: the former was constructed only by upward growth and reef flat formation, while the latter started as a barrier that later infilled. Therefore, the former is only narrow while the latter is quite wide. Reef growth can be divided into three stages: the initial stage of upward growth, the second stage of reef crest formation, and the third stage of formation of other reef features, such as spurs and grooves and reef flats (Kan et al. 1995).
Reef growth through the Holocene has added a veneer to most reefs grown over pre-existing fossil reefs, but has contributed almost entirely to fringing reef growth on the Great Barrier Reef. Differences in the degree of coral reef development are a consequence of time-varying differences in the rates of production versus destruction, and of accumulation versus dispersal (of the skeletons of reef-building organisms), throughout the time available for the reef to grow. In other words, reef building proceeds by (1) biological accretion (by corals and calcareous algae) and (2) sedimentological accretion (by deposition of reef-associated organisms), and is (3) tempered by destructive processes, such as the action of parrot fish, echinoids, infaunal sponges and polychaete worms, and by other natural and human impacts. In summary, net reef growth is framework accretion plus sedimentological accretion minus destruction.
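The summary above can be written as a simple balance. The symbols are introduced here only for illustration and do not come from the cited literature: G_net is the net rate of reef growth, A_f the rate of framework (biological) accretion, A_s the rate of sedimentological accretion, and D the rate of destruction, all in the same units (for example mm per year) and with D taken as a non-negative quantity.

```latex
G_{\mathrm{net}} = A_f + A_s - D, \qquad A_f,\ A_s,\ D \ge 0
```

Under this reading a reef accretes only while A_f + A_s exceeds D, which matches the statement that growth requires production to exceed the destructive processes.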
Biological coral reef accretion is the consequence of coral and algae growth. This can be described in terms of the demography of calcifying organisms: their densities and frequency of settlement, their rates of growth, their mean longevities and maximum sizes, and their fates following death. The increasing mass and decreasing surface-area-to-volume ratio that accompany aging of reef-building corals are both conducive to the retention of large reef building blocks on site and to the progradation of the reef across the sea floor. By contrast, short-lived colonies have a larger surface-area-to-volume ratio at the time of death, and are thus more prone to being disintegrated and dispersed by bio-eroders and by breakage, waves and currents. Reefs thus become well developed as a consequence of high settlement densities and great longevity, which maximise the retention of large framework elements on site. They fail to develop, or to progress beyond the 'incipient' stage, if settlement densities, longevities, or both are too low to counteract the bioerosional and physical losses.
On any one particular reef there are locations that are either sources or sinks of reef sediment. The outer, windward reef edge is usually the prime source. Lagoons and reef slopes, or sub-tidal and intertidal rubble banks on reef flats, are usual sink locations. At least half of the material initially incorporated in the reef (gross production) may be exported from the reef. On reefs where export is limited the reef may prograde backward by the addition of sand sheets.
"Turn ons" and "turn-offs"
Reef development can be interpreted in terms of 'turn-ons' and 'turn-offs'. Clues to the nature of the turn-on/turn-off mechanisms may be provided by interpretation of patterns in community structure and the physical environment. Reef growth is 'turned on' where physical conditions are suitable, as reflected in high settlement densities of corals and long individual life expectancies that lead to large colony sizes, especially in massive and branching forms. Here, it appears that individual colonies persist longer and accrete more often into existing framework, even in spite of periodic natural perturbations. In contrast, sparsely covered and poorly developed reefs appear to be 'turned off'. This is usually the case in harsh physical environments. The biological community clearly reflects the harsh conditions by showing a high population turnover, high rates of skeletal decay, and a narrow euphotic zone. These factors may combine to limit the accumulation of reef framework and detritus necessary for reef accretion. Fast colony growth may offset the negative effects of a sparse and ephemeral coral community in terms of production of reef framework and rubble. However, short life expectancies and low recruitment densities, in combination, are not conducive to the accumulation of a carbonate matrix. Upon death, the small, lightly calcified and poorly cemented skeletons are more prone to disintegration by bio-eroders, and to breakage (Van Woesik and Done, 1997).
The navigation of microorganisms through their fluidic environment is critical for many biological processes. For successful fertilization in mammals, spermatozoa must swim through cervical mucus and eventually penetrate the cumulus layer of the oocyte complex; in organisms that spawn, the spermatozoa encounter the different viscosities of sea water and of the egg jelly that immediately surrounds the egg (Wassarman, 1987). Disease-causing bacteria such as Escherichia coli and Salmonella typhimurium move through intestinal fluids by rotating their helical flagella (Lauga and Powers, 2009). Understanding the physics and biology underlying microorganism or unicellular motility is one of the most important goals in biological physics and cell biology research. Recently, further interest has been raised in this area, motivated by the development of individual micro- and nanoscale artificial swimmers for applications including targeted drug delivery and robotic surgery. Swimming problems in Stokesian Newtonian fluids have been extensively studied and the underlying physics is well understood. On the other hand, many biological fluids with suspended microstructures exhibit complicated non-Newtonian mechanical responses. The fundamental mechanisms of microorganism motility in a complex biological fluid are only beginning to be uncovered. Previous research has treated the polymeric fluid as a single-phase continuous medium. However, for some complex materials, such as gels, there may be relative motion between the polymer network and the water, and single-phase models for such materials may then not be appropriate.
Proposed Research Activity: Based on the extensive expertise and previous developments of the group, the proposed REU activities will concentrate on the locomotion problems in complex fluids through a combination of analytical, computational and experimental tools. Specifically, we plan to:
Cancer is a ubiquitous problem that affects millions of people yearly (US Cancer Statistics Working Group, 2013). Recently, studies have focused on identifying the heterogeneity in discrete cancer cell populations and the relevance this heterogeneity may have for the ability to predict or model the efficacy of standard-of-care therapies (Liu et al., 2013). Historically, cancer cell growth has been theoretically modeled in a variety of ways, including simply using an exponential model, or modified versions of exponential models that account for a plateau in cell growth (Yorke et al., 1993). More recently, models of cancer growth have incorporated other cellular processes that might influence cell division (Boman et al., 2007; Lander et al., 2009), but most models still rely upon assumptions of a constant division rate. In this project, we will incorporate different processes, including stochastic and non-stochastic mechanisms, to model the proliferation of cancer cells and a variety of tumor cell lines, focusing on those cells that are amenable to assessment of the models in laboratory experiments.
Proposed Research Activity: Students will be involved from the very beginning: they will develop the model, test the model in the lab, and then, using their own data, refine the model to reflect the growth of specific cell types. The initial base model will be a differential equation for cell growth rate proposed for generalized mammalian cells (Qu et al., 2004).
$$\frac{dm}{dt} = k_1\,[\mathrm{S6}]\,R - k_2\,m \qquad \text{(Equation 1)}$$
where m = total biomass of the cell; [S6] = phosphorylated ribosomal S6 kinase (a proxy for protein synthesis activity); R = ribosome concentration (a proxy for the cell's capacity for protein synthesis); k1 = rate constant for cell growth; and k2 = rate constant for cell death. This model links the processes of cell growth to cell division. These parameters and constants will be measured in our cell systems to provide a baseline. This initial base model will then be modified to introduce the idea of a 'random walk' process for the cancer cell.
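To make Equation 1 concrete, the following Python sketch integrates it under the simplifying assumption that [S6] and R are held constant in time, which is not stated in the source. In that special case the equation is linear and has the closed-form solution m(t) = m_inf + (m0 - m_inf) exp(-k2 t) with m_inf = k1 [S6] R / k2, so a forward-Euler integration can be checked against it. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import math


def steady_state_mass(k1, s6, r, k2):
    """Biomass at which growth balances loss: dm/dt = 0."""
    return k1 * s6 * r / k2


def mass_exact(t, m0, k1, s6, r, k2):
    """Closed-form solution of Equation 1 for constant [S6] and R."""
    m_inf = steady_state_mass(k1, s6, r, k2)
    return m_inf + (m0 - m_inf) * math.exp(-k2 * t)


def mass_euler(t_end, m0, k1, s6, r, k2, dt=0.001):
    """Forward-Euler integration of dm/dt = k1*[S6]*R - k2*m."""
    m = m0
    for _ in range(int(round(t_end / dt))):
        m += dt * (k1 * s6 * r - k2 * m)
    return m


# Hypothetical parameter values in arbitrary units.
params = dict(m0=1.0, k1=0.5, s6=2.0, r=3.0, k2=0.2)
print("steady state:", steady_state_mass(0.5, 2.0, 3.0, 0.2))
print("exact m(5): ", mass_exact(5.0, **params))
print("Euler m(5): ", mass_euler(5.0, **params))
```

If [S6] or R were measured as functions of time, the closed form would no longer apply and only the numerical integration, or a library ODE solver, would remain usable.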
In the first model, a generalized random walk process will be introduced to simulate the non-synchronized cell divisions in a population of cells. A random walk is a stochastic process made up of a series of steps, each of a distinct length. The idea of the 'random walk' was introduced over 100 years ago and is still used to model many different processes, with applications in areas as different as traffic control, manufacturing, physics, chemistry and biology (Pearson, 1905; Weiss, 1994). The random walk describes the behavior or the action of the participant in the defined process. For example, for an individual dividing cell, the random walk would assign a value to the cell division state (either dividing or not dividing) that is independent of the past history of the cell. However, for populations of asymmetrically dividing cells, the model would become more complicated and would make predictions about the behavior of a defined group of cells.
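One way to picture the history-independent division state is the Python sketch below. It is a minimal illustration rather than the model the project will develop: at every time step each cell independently either divides or does not, with a fixed probability that does not depend on what the cell did before. The function name, the probability value, and the fixed time step are all assumptions made for the example.

```python
import random


def simulate_population(n0, p_divide, steps, seed=0):
    """Simulate non-synchronized divisions in a cell population.

    At each step every cell divides with probability `p_divide`,
    independently of other cells and of its own past history.
    Returns the list of population sizes, starting with `n0`.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    n = n0
    counts = [n]
    for _ in range(steps):
        divisions = sum(1 for _ in range(n) if rng.random() < p_divide)
        n += divisions  # each division adds one daughter cell
        counts.append(n)
    return counts


# Hypothetical division probability per time step.
trajectory = simulate_population(n0=50, p_divide=0.1, steps=20)
print(trajectory)
```

On average the population in this sketch grows by a factor of (1 + p_divide) per step, so the simulated trajectory fluctuates around an exponential curve; asymmetric division or a division probability that depends on cell state would require extending the per-cell rule.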
Students will be involved in developing and understanding the mathematical model before beginning the biological experiments. This will allow them to gain experience in the most vital but difficult aspect of modeling biological systems; that is, how to take the biological data and translate it into a form that can be input into the model. After developing the initial model, the students will use several different cell lines to collect data on growth rates under different culture conditions, including growth in the presence of specific inhibitors. The biological parameters that will be considered include cell numbers over time, quantification of DNA replication, expression of marker proteins, and sensitivity to common agents shown to affect tumor cell growth, such as inhibitors of kinases (MEK, RAF, BRAF, MAPK), epidermal growth factor, etc., depending on the type of tumor cell being modeled. As data are gathered, it may become necessary to alter the mathematical models or even increase the model complexity. The model will also be used to compare the growth of the cells in the presence or absence of specific inhibitors. Overall, the students will be exposed to all levels of developing the model: from developing a model that can be tested and verified, to gathering the biological data, to application of the theory and analysis of experimental model outcomes.
Advances in digital capture devices and computing power have facilitated the automated collection of image libraries substantial in both number and file size. Such libraries have the potential to provide large data sets that were previously difficult to access. However, manual methods for analyzing such libraries are generally so onerous (sometimes impossible) that automated methods are in high demand to maximize the utility of these images and provide results within a reasonable time period. Indeed, automated analysis is increasingly being utilized to refine our understanding of a variety of biological phenomena in fields including proteomics, genomics, and eukaryotic development (Klevecz, 2000; Liang, 1993). While several image analysis methods have been developed to detect, localize, and track indicator signals in digital image sequences, these methods achieve different rates of success depending on the imaging modality and biological event of interest.
Developing a “one-size-fits-all” approach to automated image analysis is laudable; however, it is impractical due to the fundamental differences in imaging modalities and biological studies. Unfortunately, the current pace at which new algorithms are developed for case-by-case studies is significantly slower than the rate at which advances in automated image acquisition technology are occurring. This gap is a bottleneck in the research process that can limit both the quality and the maximum utility of a collected dataset.
One example of this imaging problem is in efforts to characterize the root system architecture (RSA) of plants. The RSA can provide valuable insight into nutrient availability, phytohormone utilization, the roles of specific genes, and more (Osmont, et al., 2007). However, the successful analysis of RSA variations can involve libraries with hundreds to thousands of images of variable quality, with only slight variations in root length, the number of lateral roots, length of the root elongation zone, etc. The manual analysis of even the simplest RSA studies, investigating a single feature and 100 seedlings, is often time consuming and prone to error. Many automated commercial software packages, while effective, are still surprisingly expensive and often tied to single-instrument (scanner) usage licenses, limiting their utility, portability, and value as a teaching and research tool (Arsenault et al., 1995). Open-source or other free software packages typically offer limited automation, report on only one or two features of the RSA, and may require sample preparation not amenable to many growth formats (French et al., 2009).
Proposed Research Activity: The solution to the image analysis problem lies in an interdisciplinary approach in which biologists and mathematicians coordinate their efforts. This solution requires not only advances in image analysis, but also the design of new experimental methods whose output (the images) is more amenable to automated analysis. We propose to expose students to this emerging area of research through the analysis of the RSA of the model plant system Arabidopsis thaliana in response to bacterial secondary metabolites. The ultimate goal of the project will be to design a suite of automated image processing techniques and complementary experimental procedures for RSA analysis using portable digital image capture devices such as cell phone cameras.
The RSA image analysis teams (students and mentors) will work together over the ten weeks on three separate modules. Module one will introduce students to conventional methods for image acquisition and analysis as well as basic features of the RSA of Arabidopsis. In module two, the students will explore how these methods can be applied to address questions in biological research, with a special emphasis on RSA. Finally, in module three students will attempt to develop new image analysis techniques and complementary experimental methods to facilitate the analysis of RSA in their samples. Each summer, the student team will be given a specific problem to resolve, such as “How do you determine the number and length of lateral roots on the plant?” Solutions proposed, developed and evaluated by student teams can then be developed further over the course of the regular academic year for implementation in research and teaching labs. In addition to their utility in the analysis of RSA, these programs will serve as a positive example of the REU’s program goals and the value of specialized programs in BioMathematics.
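As a flavour of the kind of question a student team might start from, the Python sketch below makes a very rough first pass at counting lateral roots. It is a toy illustration, not a method from the project: it assumes the image has already been thresholded and thinned to a one-pixel-wide skeleton (represented here as text, with '#' marking root pixels), that there is a single primary root, and that each lateral root is unbranched and at least two pixels long. Under those assumptions the skeleton has one free end per lateral root plus the two ends of the primary root. The function names and the example grid are hypothetical.

```python
def count_neighbours(grid, r, c):
    """Count root pixels among the 8 neighbours of (r, c)."""
    count = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            rr, cc = r + dr, c + dc
            if 0 <= rr < len(grid) and 0 <= cc < len(grid[rr]):
                if grid[rr][cc] == "#":
                    count += 1
    return count


def skeleton_stats(grid):
    """Summarise a one-pixel-wide root skeleton.

    Endpoints are root pixels with exactly one neighbouring root pixel.
    The lateral-root estimate subtracts the two ends of the primary root.
    The pixel count is only a crude proxy for total root length.
    """
    pixels = [(r, c) for r, row in enumerate(grid)
              for c, ch in enumerate(row) if ch == "#"]
    endpoints = [p for p in pixels if count_neighbours(grid, *p) == 1]
    return {
        "pixels": len(pixels),
        "endpoints": len(endpoints),
        "laterals": max(len(endpoints) - 2, 0),
    }


# Hypothetical skeleton: a vertical primary root with two lateral roots.
example = [
    "..#....",
    "..#....",
    "..####.",
    "..#....",
    "###....",
    "..#....",
    "..#....",
]
print(skeleton_stats(example))
```

Real seedling images would need noise removal, calibration from pixels to physical length, and handling of crossing or branched laterals, which is where the new techniques and experimental designs described above come in.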