Category Archives: Success Stories

Projected changes in solar UV radiation in the Arctic and sub-Arctic Oceans: Effects from changes in reflectivity, ice transmittance, clouds, and ozone

Scientists from the Laboratory of Atmospheric Physics (Department of Physics, Aristotle Univ. of Thessaloniki) have investigated the changes in solar UV irradiance in the present (mean levels for the period 2005-2015) and in the future (mean levels for the period 2090-2100) compared to the past (mean levels for the period 1950-1960). The study focused mainly on the Arctic and sub-Arctic Oceans. The derived changes in UV irradiance were attributed to the corresponding changes in surface reflectivity, total ozone column and cloudiness. The projected levels of the UV-B irradiance, the UV-A irradiance and the UV index at the surface, and of the UV-B irradiance transmitted into the ocean, were quantified.

The spectral solar irradiance reaching the Earth’s surface was calculated using the pseudospherical approximation of the cdisort solver (Buras et al., 2011) in the radiative transfer model UVSPEC, which is included in version 1.7 of the libRadtran package (Mayer and Kylling, 2005). The simulations were based on inputs from four Earth-System Models that participated in the fifth phase of the Coupled Model Intercomparison Project (CMIP5) (Taylor et al., 2011), and on climatological data for aerosols (Kinne et al., 2013). Simulations were performed for a standard 10°×2.5° (longitude × latitude) grid and for two different socioeconomic scenarios, corresponding to different future GHG emissions.
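As a rough illustration of how such spectra are typically obtained from libRadtran (a hypothetical, minimal clear-sky setup with made-up values; the uvspec keywords below follow the libRadtran documentation as we recall it and should be checked against the installed version, where names such as solar_file or ozone_column may apply instead):

```python
import subprocess

# Hypothetical single-point example; the actual study used per-grid-cell
# inputs (ozone, reflectivity, clouds) from four CMIP5 Earth-System Models.
sza = 60.0        # solar zenith angle (degrees)
ozone_du = 300.0  # total ozone column (Dobson units)
albedo = 0.7      # surface reflectivity, e.g. over sea ice

# Minimal clear-sky uvspec input; keyword names should be verified against
# the local libRadtran installation (version 1.7 in the study).
uvspec_input = f"""
atmosphere_file ../data/atmmod/afglss.dat
source solar ../data/solar_flux/atlas_plus_modtran
rte_solver cdisort
sza {sza}
albedo {albedo}
mol_modify O3 {ozone_du} DU
wavelength 280.0 400.0
output_user lambda edir edn
"""

# uvspec reads its input from stdin and writes the spectrum to stdout
result = subprocess.run(["uvspec"], input=uvspec_input,
                        capture_output=True, text=True, check=True)
print(result.stdout.splitlines()[:5])
```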

Results of this study suggest that, for specific months and grid cells, the monthly mean UV index at the surface can be up to 40% lower in the future compared to the past (Figure 1), while the monthly mean UV-B that enters the ocean can be up to 10 times higher (Figure 2).

A large part of the simulations was performed on the EGI and HellasGrid infrastructures with the support of the Scientific Computing Services Office at the Aristotle University of Thessaloniki. In the researchers’ own words:

“Using these resources enabled us to drastically limit the time required for the simulations.”

Figure 1. Multimodel monthly mean noon UVI changes (in %) for 2090–2100 relative to the 1950–1960 mean. Clear-sky changes are shown in (a–c) RCP 4.5 and (d–f) RCP 8.5 for April, June, and August, respectively. All-sky changes are shown in (g–i) RCP 4.5 and (j–l) RCP 8.5. Simulations are shown only for ocean-covered areas (Figure after Fountoulakis et al., 2014).
Figure 2. Ratio of the monthly mean noon UV-B irradiance that is transmitted into the ocean in the future (2090–2100 mean) relative to the past (1950–1960 mean) for April, June, and August and for the scenarios (a–c) RCP 4.5 and (d–f) RCP 8.5 (Figure after Fountoulakis et al., 2014).

References:

  • Buras, R., T. Dowling, and C. Emde (2011), New secondary-scattering correction in DISORT with increased efficiency for forward scattering, J. Quant. Spectrosc. Radiat. Transfer, 112(12), 2028–2034, doi:10.1016/j.jqsrt.2011.03.019.
  • Fountoulakis I., Bais, A. F., Tourpali, K., Fragkos, K., and Misios, S. (2014), Projected changes in solar UV radiation in the Arctic and sub-Arctic Oceans: Effects from changes in reflectivity, ice transmittance, clouds, and ozone, J. Geophys. Res. Atmos., 119(13), 8073–8090, doi: 10.1002/2014JD021918
  • Kinne, S., D. O’Donnell, P. Stier, S. Kloster, K. Zhang, H. Schmidt, S. Rast, M. Giorgetta, T. F. Eck, and B. Stevens (2013), MAC-v1: A new global aerosol climatology for climate studies, J. Adv. Model. Earth Syst., 5, 1–37, doi:10.1002/jame.20035.
  • Mayer, B., and A. Kylling (2005), Technical note: The libRadtran software package for radiative transfer calculations – Description and examples of use, Atmos. Chem. Phys., 5(7), 1855–1877, doi:10.5194/acp-5-1855-2005.
  • Taylor, K. E., R. J. Stouffer, and G. A. Meehl (2011), An overview of CMIP5 and the experiment design, Bull. Am. Meteorol. Soc., 93(4), 485–498, doi:10.1175/bams-d-11-00094.1.

 

Probing the properties of materials with ab initio quantum-mechanical calculations

The use of so-called first-principles (or ab initio) simulations of the formation and dynamics of materials plays a prominent role in modern research across various fields of physics and chemistry. The HellasGrid infrastructure has allowed members of the Computational Condensed Matter Physics and Materials Science Group (NTUA-CCMP) of the Physics Department at the National Technical University of Athens to perform systematic and extensive ab initio studies on a number of materials that show great potential for technological applications. The calculations utilized the state-of-the-art approach of density-functional theory (DFT), which enables the solution of the many-body electron problem in extended systems with high accuracy.

Graphene and graphene-like two-dimensional (2D) materials stand out as prominent members of the rapidly expanding family of nano-materials. Using DFT calculations, the NTUA-CCMP group has probed [1],[2] the atomic-scale mechanisms that facilitate the formation of hydrogenated graphene, also known as graphane, a wide-band-gap semiconductor that, in principle, could be combined with graphene in all-carbon electronics. Similar recent DFT studies [3] have examined the formation and properties of complex 2D hydrogenated silicene and germanene (the silicon and germanium analogues of graphene) and showed that these materials could have interesting applications in nano-electronics and nano-mechanical systems. Moreover, DFT calculations analyzed the structural details of mono-layer films of silicon and germanium, either on metallic substrates [4] or as free-standing ultra-thin films [5].


Figure 1: Spin densities in a TiB2/MnB2 superlattice with one MnB2 layer and 5 TiB2 layers in the unit cell. Magnetizations of successive MnB2 layers are parallel and antiparallel in the top and bottom panels, respectively.

In the same spirit, extensive ongoing ab initio investigations [6],[7] have targeted the physical characteristics of complex structures of metallic di-borides, a large class of quasi-2D materials. These investigations have already identified two key features of di-boride systems: the presence of so-called inter-layer exchange coupling in magnetic superlattices (Fig. 1) and the stability of nano-columns in TiB2 with an excess of boron. Both effects could lead to important applications in, for example, magnetic recording, spintronics, and low-dimensional electronics.

Finally, DFT simulations [8],[9] have been employed to study transformations of PCBM crystals and the dependence of their electronic properties on impurities. PCBM is a fullerene chemical derivative with a bucky-ball C60 core and a functional tail. Although PCBM molecules are renowned for their high efficiency as electron acceptors in organic photovoltaics (OPV), the types of crystals that these molecules form remained elusive. Systematic DFT investigations [8] of continuous PCBM crystal transformations addressed this open issue and identified several locally stable types of PCBM crystals. In addition, recent ab initio studies [9] described the particulars of oxygen and water insertion in PCBM crystals and the role of these impurities as degradation agents in PCBM-based OPV systems.

Contact Details:

  • L. Tsetseris, Assistant Professor, NTUA, leont (at) mail.ntua.gr

References:

The importance of GRID computing in the investigation of climate

Climate change is unequivocal, as is evident from observations of increases in global average air and ocean temperatures, widespread melting of snow and ice, and a rising global average sea level (IPCC, 2007). Since climate change bears on important societal issues, it is essential to assess the impacts of climate change already underway and to address adaptation strategies that reduce the vulnerability and risks of climate change.

Climate models use mathematics and the laws of physics to simulate the interactions of the basic components of the climate system. Differential equations relate fundamental physical quantities (e.g. temperature, pressure, wind) to each other. Each equation is solved at discrete grid points on the Earth’s surface and at several vertical layers, defined by a regular three-dimensional grid, advancing at a fixed time interval (time step). Horizontal resolutions of global climate models range between 100 and 200 km, while those of regional climate models range from 10 to 50 km.
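To make the grid-point idea concrete, the toy sketch below (purely illustrative; it is not the RegCM3 model and uses made-up parameter values) advances a single field, temperature, on a coarse regular latitude-longitude grid by explicit time-stepping of a simple diffusion equation:

```python
import numpy as np

# Toy illustration only: one prognostic field (temperature) on a coarse
# regular grid, advanced with explicit time-stepping of a diffusion equation.
# Real climate models solve coupled equations for many variables and layers.
nlat, nlon = 36, 72          # roughly 5-degree horizontal resolution
dt = 600.0                   # time step in seconds
kappa = 1.0e-5               # diffusion coefficient (grid units^2 per second)

T = 288.0 + 10.0 * np.random.randn(nlat, nlon)   # initial temperature field (K)

def step(T):
    # Discrete Laplacian with periodic longitude and repeated-edge poles
    up    = np.vstack([T[:1, :], T[:-1, :]])
    down  = np.vstack([T[1:, :], T[-1:, :]])
    left  = np.roll(T, 1, axis=1)
    right = np.roll(T, -1, axis=1)
    laplacian = up + down + left + right - 4.0 * T
    return T + dt * kappa * laplacian

for _ in range(1000):        # advance 1000 time steps
    T = step(T)

print("global mean temperature:", T.mean())
```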

In the Department of Meteorology and Climatology at the Aristotle University of Thessaloniki, high-resolution (10 km), transient (1961-2050) climate simulations were performed over South-Eastern Europe with the regional climate model RegCM3 (http://gforge.ictp.it/gf/project/regcm/) using the HellasGrid resources, within the framework of the ongoing Geoclima project.

The simulations were performed under the IPCC A1B scenario (http://www.ipcc.ch/ipccreports/tar/wg1/029.htm). Projected near-surface temperatures, starting from the present climate and extending to the middle of the 21st century, are shown in Figure 1.


Figure 1 – Evolution (1961-2050) of near-surface temperature over South-Eastern Europe, simulated at AUTH using the HellasGrid computational resources

The final aim of the Geoclima project (www.geoclima.eu) is to develop a Geographical Information System (GIS) allowing the user to visualize, manage and analyze information which is directly or indirectly related to climate and its future projections over SE Europe.

Contact details: 

  • H. Feidas (PI), Associate Professor, AUTH, hfeidas (at) geo.auth.gr

  • P. Zanis, Assistant Professor, AUTH, zanis (at) geo.auth.gr

  • E. Katragkou, Lecturer, AUTH, katragou (at) auth.gr

  • Scientific Computing Center, AUTH, contact (at) grid.auth.gr

References

  1. IPCC, 2007: Climate Change 2007: Synthesis Report. Contribution of Working Groups I, II and III to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change [Core Writing Team, Pachauri, R.K and Reisinger, A. (eds.)]. IPCC, Geneva, Switzerland, 104 pp.

HECTOR: Enabling Microarray Experiments over the Hellenic Grid Infrastructure

Scientists from the National Hellenic Research Foundation, the University of the Aegean, the National Technical University of Athens and Athens Information Technology have used the HellasGrid and EGI Grid infrastructures in order to solve problems from the areas of computational biology, medical imaging and distributed systems. The goal of this group, which developed the HECTOR web application, is to implement tools for biological data analysis over parallel and distributed systems, in order to exploit the vast processing and storage resources of the HellasGrid infrastructure.

In the area of bioinformatics, research efforts are drawn to whole-genome functional genomics studies through the use of high-throughput DNA microarray technology, which has been the standard experimental technique over the last six years for studying the complete gene complement of an organism. Experiments of this type have a very high potential impact and are therefore adopted by a large and ever-increasing number of laboratories. Laboratories and research groups are nowadays in pressing need of powerful computational methodologies for processing microarray datasets, supporting through friendly interfaces functionalities related to array fabrication, labeling, hybridization, and data analysis. Such tools are a key prerequisite for effective data analysis that could automate the process and ultimately provide new insights into the transcriptomic component of the biological systems investigated.

The problem that arises with the ever-growing usage of microarray technology is the large amount of data produced every day, combined with the fact that data pre-processing and statistical selection algorithms demand high computational resources, making single-threaded applications time-consuming.

By utilizing the grid infrastructure, the HECTOR [1] team managed to overcome the computational burden of such single-node applications. Starting from a set of legacy MATLAB applications, ANDROMEDA (Automated aND Robust Microarray Experiments Data Analysis) [2], originally developed by researchers of the National Hellenic Research Foundation (NHRF), a new parallel statistical analysis pipeline was created.

The initial step was the transformation of all required MATLAB functions to an open-source derivative, Octave-Forge, which is installed on the HellasGrid nodes in order to perform the data analysis. In addition, all the initial data-parsing functions were re-implemented from scratch as Python scripts to further speed up the operation. The pipeline was separated into the phases of parsing/pre-processing, normalization, statistical analysis and post-processing, and was studied for potential parallelization. Due to their large computational needs and their ability to run independently of each other, the first two phases were designed to run on multiple nodes, as seen in Figure 1. The use of MPI technology made it possible to implement the whole workflow of the application, orchestrate the execution on multiple nodes of a site, and monitor execution times and possible errors. After the completion of the parallelized part, the data are sent to the head MPI node for the remaining statistical processing, which requires all experiment-related data to be present.
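The general pattern can be sketched with mpi4py as follows (illustrative only: the file names and the preprocess_chip stand-in are hypothetical placeholders for the Octave and Python steps described above):

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

def preprocess_chip(filename):
    # Hypothetical stand-in for the parsing/pre-processing and normalization
    # steps that the pipeline runs on each individual microarray file.
    return {"file": filename, "normalized": True}

# Hypothetical list of raw microarray files for one experiment
chip_files = [f"array_{i:03d}.txt" for i in range(40)]

# Each rank takes every size-th file (simple static work distribution)
my_results = [preprocess_chip(f) for f in chip_files[rank::size]]

# Gather all per-chip results on the head rank for the joint statistics
all_results = comm.gather(my_results, root=0)

if rank == 0:
    flat = [r for part in all_results for r in part]
    print(f"head rank received {len(flat)} preprocessed arrays; "
          "running statistical selection...")
```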

In the HECTOR platform this processing workflow is fully automated through a JSP-based, user-friendly web portal that enables scientists and users from the fields of biology and medicine to use the processing power of the grid without any specialized knowledge of informatics. After the completion of the statistical analysis, users are notified to retrieve their resulting gene lists, to further annotate their experiments using the implemented MIAME XML platform, and to decide whether or not to make the results public through the distributed database of the HECTOR platform, which uses the HellasGrid storage elements.

Figure 1 - HECTOR platform

Contacts

  • I. Maglogiannis, University of the Aegean, Samos, Greece, imaglo (at) aegean.gr
  • A. Chatziioannou, National Hellenic Research Foundation, Greece, achatzi (at) eie.gr
  • I. Kanaris, University of the Aegean, Samos, Greece, kanaris.i (at) aegean.gr
  • V. Mylonakis, National Technical University of Athens, Athens, Greece, vmil (at) netmode.ntua.gr
  • J. Soldatos, Athens Information Technology, Athens, Greece, jsol (at) ait.edu.gr

References

  1. I. Maglogiannis, A. Chatzioannou, J. Soldatos, V. Mylonakis, J. Kanaris, “An Application Platform Enabling High Performance Grid Processing of Microarray Experiments”, In Proc. 20th IEEE Conf. on Computer-Based Medical Systems (CBMS 2007), pp. 477-482, Maribor, Slovenia.
  2. J. Soldatos, I. Maglogiannis, A. Chatzioannou, V. Mylonakis, J. Kanaris, ‘Application Architecture for High Performance Microarray Experiments over the Hellas-Grid Infrastructure’, EGEE User Forum, Manchester, United Kingdom, May 9-11, 2007.
  3. Kanaris, V. Mylonakis, A. Chatziioannou, I. Maglogiannis, and J.Soldatos, “HECTOR: Enabling Microarray Experiments over the Hellenic Grid Infrastructure,” J. Grid Comput., vol. 7, no. 3, pp. 1–22, Aug. 20

GRISSOM Platform: Grids for In Silico Systems Biology and Medicine

Scientists from the University of Central Greece, the National Hellenic Research Foundation and the University of the Aegean have used the HellasGrid and EGI Grid infrastructures in order to solve problems from the area of bioinformatics.

Transcriptomic experiments perform global gene expression monitoring, thus enabling thorough probing of the in-vivo cellular state and its regulation, in healthy and diseased states, in response to numerous environmental stimuli, across different species, etc. DNA microarrays have become a mainstay for a vast range of genomic applications, helping to identify significant alterations in the transcriptomic expression of the system investigated and to map them to specific phenotypic outcomes.

There is a pressing need for computationally intelligent solutions that provide versatile, powerful and user-friendly data-mining functionalities in order to tackle the enormous underlying complexity of gene-profiling experiments. At the same time, there is an ever-growing need for computational power, as the size of the experimental datasets keeps increasing.

Figure 1 illustrates an overview of the workflow structure of GRISSOM [1]. The platform has been designed to effectively accommodate the needs of a wide range of users with different levels of expertise who wish to perform versatile and varying series of operations. The core of the developed web application, namely the quantitative signal processing and statistical analysis of the microarrays, which represents the computationally expensive part of the analysis pipeline, as well as the storage of the datasets and of the annotation files, exploits the HellasGrid infrastructure. Overall, the DNA microarray data-analysis tasks implemented within the platform encompass diversified processing steps that are heterogeneous in their nature, data types and complexity. These can be broadly partitioned into the categories of data import, gene selection, gene annotation (gene, platform and experiment), integrative interpretation capabilities, secure database storage and maintenance, and support of various output formats.

Figure 1- Workflow structure of GRISSOM

With respect to the efficient interpretation of DNA microarray experiments, GRISSOM supports gene classification based on clustering algorithms or cellular pathway analysis, through the integration of statistical ranking of annotated genomic experimental results. In this way, statistical enrichment analysis is performed, exploiting controlled biological vocabularies such as the Gene Ontology (GO) or the KEGG Ontology. Another capability of GRISSOM is the reconstruction of SBML-compliant cellular super-pathway models by exploiting KEGGconverter, based on the KEGG pathway IDs derived from the analysis performed by StRAnGER.
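A common formulation of such an enrichment test (shown here as a generic sketch with made-up numbers, not necessarily the exact procedure used by StRAnGER) is the hypergeometric test: given the number of genes on the array, the number annotated with a given GO or KEGG term, and the number of selected genes carrying that term, the probability of an equal or larger overlap occurring by chance is computed.

```python
from scipy.stats import hypergeom

# Illustrative numbers only
N = 20000   # genes represented on the array
K = 150     # genes annotated with a given GO/KEGG term
n = 400     # genes selected as differentially expressed
k = 12      # selected genes that carry the term

# P(X >= k) for a hypergeometric draw: the enrichment p-value of the term
p_value = hypergeom.sf(k - 1, N, K, n)
print(f"enrichment p-value: {p_value:.3e}")
```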

Grids are environments that are extremely heterogeneous in terms of resources, tasks, policies and time demands, posing considerable challenges for the effective accommodation and routing of all the competing requests. In order for GRISSOM to use as much processing power as possible while keeping queuing times manageable, the application has been designed to support distributed computing methodologies for two different grid configurations: a) through job schedulers that perform supervised job management on the Grid, driven by a special directed acyclic graph (DAG) and written in Python and the Octave-Forge mathematical language, and b) by utilizing MPI computing workflows. DAG management renders the system resilient even for huge datasets, which can be processed even when the grid infrastructure is extremely loaded and has minimal resource availability. An overview of how the web application resides between the users and the HellasGrid infrastructure is shown in Figure 2.

Figure 2 - Overview of how the web application resides between users and the HellasGrid infrastructure
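A minimal, generic sketch of the DAG-driven scheduling idea described above (task names and structure are illustrative only, not the actual GRISSOM scheduler): a task is submitted as a grid job only once all of its parent tasks have completed, which is what keeps the workflow intact even on a heavily loaded infrastructure.

```python
from collections import deque

# Illustrative DAG: each task lists the tasks it depends on
dag = {
    "import":    [],
    "normalize": ["import"],
    "select":    ["normalize"],
    "annotate":  ["select"],
    "store":     ["select", "annotate"],
}

def topological_order(dag):
    # Kahn's algorithm: repeatedly release tasks whose dependencies are done
    indegree = {t: len(deps) for t, deps in dag.items()}
    children = {t: [] for t in dag}
    for t, deps in dag.items():
        for d in deps:
            children[d].append(t)
    ready = deque(t for t, d in indegree.items() if d == 0)
    order = []
    while ready:
        t = ready.popleft()
        order.append(t)
        for c in children[t]:
            indegree[c] -= 1
            if indegree[c] == 0:
                ready.append(c)
    return order

for task in topological_order(dag):
    # In a real deployment each released task would become a grid job
    # submission; here we only print the execution order.
    print("submit", task)
```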

Contacts

  • I. Maglogiannis, University of Central Greece, Lamia, Greece, imaglo (at) ucg.gr
  • A. Chatziioannou, National Hellenic Research Foundation, Greece, achatzi (at) eie.gr
  • I. Kanaris, University of the Aegean, Mytilene, Greece, kanaris.i (at) aegean.gr
  • C. Doukas, University of the Aegean, Mytilene, Greece, doukas (at) aegean.gr
  • P. Moulos, National Hellenic Research Foundation, Greece, p.moulos (at) eie.gr
  • F. Kolisis, National Hellenic Research Foundation, Greece, kolisis (at) eie.gr

References

  1. GRISSOM Platform: Enabling distributed Processing and Management of Biological Data through fusion of Grid and Web Technologies. A. Chatziioannou, I. Kanaris, C. Doukas, P. Moulos, F.N. Kolisis and I. Maglogiannis, IEEE Transactions on Information Technology in Biomedicine, 2011, 15 (1), art. no. 5638146, pp. 83-92.
  2. A. Chatziioannou, I. Kanaris, I. Maglogiannis, C. Doukas, P. Moulos, E. Pilalis and F. Kolisis : “GRISSOM web based Grid portal: Exploiting the power of Grid infrastructure for the interpretation and storage of DNA microarray experiments” In Proc of  9th IEEE International Special Topic Conference on Information Technology in Biomedicine (ITAB 2009) Larnaka Cyprus
  3. KEGGconverter: a tool for the in-silico modelling of metabolic networks of the KEGG Pathways database. K.Moutselos, I.Kanaris, A.Chatziioannou, I. Maglogiannis, F.N. Kolisis (BMC Bioinformatics 10:324), 2009. (featured article of the volume, characterized as Highly Accessed).
  4. C. Doukas, I. Maglogiannis, A. Chatziioannou, “Certification and Security Issues in Biomedical Grid Portals: The GRISSOM Case Study”, “Certification and Security in Health-Related web applications: Concepts and Solutions” IGI Press (to appear)
  5. A. Chatziioanou, I. Maglogiannis, I. Kanaris, C. Doukas, E. Pilalis, P. Moulos, F. Kolisis, “GRISSOM: A Web based portal and repository for interpretation and storage of DNA microarray experiments”, presented at 4th EGEE User Forum/OGF 25 and OGF Europe’s 2nd International Event, 2-6 March 2009, Catania, Italy.

 

 

Density Functional Theory Calculations on Atmospheric Degradation Reactions of Fluorinated Propenes

Scientists from NCSR “Demokritos”, the University of Crete, the National Oceanic and Atmospheric Administration and the University of Colorado have used the HellasGrid and EGI Grid infrastructures in order to solve problems from the area of physical chemistry.

Halogenated organic compounds have been effectively utilized as industrial solvents, fire suppressants and refrigeration media for several decades, the first and best-known class being the chlorofluorocarbons (CFCs). However, CFCs are harmful to the protective layer of stratospheric ozone, an effect attributed to the presence of chlorine atoms. Thus, new classes of CFC alternatives containing no chlorine atoms are being devised; these must additionally possess short atmospheric lifetimes and a minimal contribution to global warming. An extensive investigation of their reactivity is primarily performed by experimental studies, in order to assess their impact on atmospheric quality. Since experiments may not be able to provide all the answers, assistance is sought from molecular quantum-mechanical calculations.

The atmospheric reactivity of two hydrofluoro-olefins (HFOs), CF3CF=CH2 (2,3,3,3-tetrafluoropropene, HFO-1234yf) and (Z)-CF3CF=CHF (1,2,3,3,3-pentafluoropropene, HFO-1225ye), was investigated both experimentally and theoretically. Theoretical calculations were performed by Density Functional Theory (DFT) in order to elucidate several aspects of HFOs chemistry in the atmosphere.

The theoretical calculations yielded equilibrium molecular structures along with vibrational frequencies and absolute electronic energies at reliable levels of theory, comprising the B3P86 functional with the aug-cc-pVDZ and aug-cc-pVTZ basis sets, respectively. The data were subsequently fed into statistical-thermodynamics formulas to compute the reaction enthalpies. The most likely sites for Cl, OH and NO3 addition to the double bond of each HFO were determined, and the energetics of the initial adduct-formation pathways helped to explain the experimentally observed difference in pressure dependence between the Cl and OH reactions. Furthermore, the calculated exothermicities for the degradation reactions of the peroxy radicals in the presence of O2, NO and HO2 yielded the most likely atmospheric degradation products of the HFOs, in support of the experimental results.
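In outline (a generic textbook formulation, not a transcript of the authors’ worksheets), the reaction enthalpy at 298.15 K is assembled from the electronic energies, zero-point energies (ZPE) and thermal enthalpy corrections of products and reactants:

\[ \Delta H^{\circ}_{298} \;=\; \sum_{\text{products}} \Bigl( E_{\text{elec}} + \mathrm{ZPE} + \Delta H_{\text{therm}}(298.15\,\mathrm{K}) \Bigr) \;-\; \sum_{\text{reactants}} \Bigl( E_{\text{elec}} + \mathrm{ZPE} + \Delta H_{\text{therm}}(298.15\,\mathrm{K}) \Bigr) \]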

Figure 1 - Enthalpy diagram (298.15 K) for the reaction of Cl, NO3, and OH with CF3CF=CH2 and formation of peroxy adducts calculated at the B3P86/aug-cc-pVTZ level of theory.
As a large number of calculations was needed, due to the number of molecular species involved in this work, the computing power of the South-Eastern European Virtual Organization (SEE-VO) was effectively utilized to accomplish the task. Automation of the job submission and data retrieval processes was achieved with shell scripts using commands of the gLite middleware.

Future plans include the examination of the degradation mechanisms of halogenated organic molecules, which act as harmful and toxic pollutants in aqueous environments, using Density Functional Theory (DFT).

Contacts

  • Dr. Yannis G. Lazarou, Institute of Physical Chemistry, NCSR “Demokritos”, Aghia Paraskevi, Attiki, Greece, lazarou (at) chem.demokritos.gr
  • Dr. Vassileios C. Papadimitriou, Department of Chemistry, University of Crete, Heraklion, Greece, bpapadim (at) chemistry.uoc.gr
  • Dr. James B. Burkholder, ESRL, CSD, National Oceanic and Atmospheric Administration (NOAA), Boulder, Colorado, USA, James.B.Burkholder (at) noaa.gov
  • Dr. Ranajit K. Talukdar, CIRES, University of Colorado, Boulder, Colorado, USA, Ranajit.K.Talukdar (at) noaa.gov

Reference

  1. “Atmospheric Chemistry of CF3CF=CH2 and (Z)-CF3CF=CHF: Cl and NO3 Rate Coefficients, Cl Reaction Product Yields, and Thermochemical Calculations”, V.C. Papadimitriou, Y.G. Lazarou, R.K. Talukdar, J.B. Burkholder, J. Phys. Chem. A, 2011, 115, 167–181.

Investigating the nature of explosive percolation transition

The Laboratory of Computational Physics is actively involved in the investigation of phase transitions in various natural and artificial systems. Currently, much effort is concentrated on determining the type of phase transition of a new competitive model named “explosive” percolation: when filling an empty lattice sequentially with occupied sites, instead of randomly occupying a site or bond (as in the classical paradigm), we choose two candidates and examine which of them leads to the smaller cluster. That candidate is kept as a new occupied site on the lattice, while the other is discarded (Figure 1). This procedure considerably slows down the emergence of the giant component, which is then formed abruptly, hence the term “explosive”.



Figure 1: Achlioptas Process according to the sum rule (APSR) for site percolation. White cells correspond to unoccupied sites, while colored cells correspond to occupied sites. Different colors (red, green, gray, blue) indicate different clusters. (a) We randomly select two trial unoccupied sites (yellow), denoted A and B, one at a time. We evaluate the sizes of the clusters that would be formed containing sites A and B, \(s_A\) and \(s_B\) respectively. In this example \(s_A = 10\) and \(s_B = 14\). (b) According to the Achlioptas Process, we keep site A, which leads to the smaller cluster, and discard site B.
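A compact sketch of the forward sum-rule process (illustrative code using a standard union-find structure; not the production code used for the large-lattice runs reported below):

```python
import random

L = 64                     # lattice size (L x L); small, for illustration
N = L * L
parent = list(range(N))    # union-find forest over lattice sites
csize = [1] * N            # size of the cluster rooted at each site
occupied = [False] * N

def find(i):
    while parent[i] != i:
        parent[i] = parent[parent[i]]   # path halving
        i = parent[i]
    return i

def neighbors(i):
    r, c = divmod(i, L)
    return [((r - 1) % L) * L + c, ((r + 1) % L) * L + c,
            r * L + (c - 1) % L, r * L + (c + 1) % L]

def size_if_occupied(site):
    # Size of the cluster that would contain `site` after its occupation
    roots = {find(n) for n in neighbors(site) if occupied[n]}
    return 1 + sum(csize[r] for r in roots)

def occupy(site):
    occupied[site] = True
    for n in neighbors(site):
        if occupied[n]:
            ra, rb = find(site), find(n)
            if ra != rb:
                parent[rb] = ra
                csize[ra] += csize[rb]

unoccupied = set(range(N))
while len(unoccupied) > 1:
    # Sum rule: pick two trial unoccupied sites, occupy the one that would
    # create the smaller cluster, and leave the other unoccupied.
    a, b = random.sample(list(unoccupied), 2)
    keep = a if size_if_occupied(a) <= size_if_occupied(b) else b
    unoccupied.remove(keep)
    occupy(keep)

largest = max(csize[find(i)] for i in range(N) if occupied[i])
print("largest cluster after filling the lattice:", largest)
```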

 

Following the first publication of Achlioptas et al., a debate was initiated among various teams as to whether the transition is continuous or discontinuous. Contributing to these considerations, we have investigated explosive site percolation using both the product and the sum rule. It was found that the exponent \(\beta/\nu\) is vanishingly small in both cases, pointing towards the continuity of the transition. We also performed a numerical analysis for the case of a reverse Achlioptas process (Figure 2). It was shown that for finite systems there is a hysteresis loop between the reverse and the forward procedure, which vanishes in the thermodynamic limit (Figure 3), giving strong evidence for the continuity of “explosive” site percolation. Moreover, “explosive” site and bond percolation seem to belong to different universality classes.

        

Figure 2: Reverse Achlioptas Process (AP1) for site percolation according to the sum rule. Blue indicates occupied sites and white unoccupied sites. Initially, the lattice is fully occupied. (a) An instance of the process: we randomly choose two trial sites (yellow), denoted A and B, and remove them from the lattice. (b) The clusters formed after the removal. (c) We place site A back in the lattice and calculate the size of the cluster to which it belongs, \(s_A = 16\). (d) We do the same for site B and calculate \(s_B = 26\). We remove site A, which leads to the formation of the smaller cluster, and keep site B.


 


Figure 3: (a) Hysteresis loop between a reverse (red dots) and the forward (black squares) Achlioptas process for a \(700 \times 700\) system. (b) The loop vanishes in the thermodynamic limit.

 

Simulations were performed on the EGI. A diagram of the number of jobs and CPU hours consumed per month is shown in Figure 4. We made extensive use of the gLite parametric job submission mechanism, using the different realizations of the system as the parameter. On average, more than 1000 jobs per simulation were submitted for each lattice size. For a typical \(1000 \times 1000\) lattice, the average time consumed for one run approached 172 minutes. Had we performed the calculations on a single CPU, it would have taken about 120 days to obtain complete results for just one lattice size. Using the EGI thus reduced this time to approximately 172 minutes, a time gain of the order of \(10^3\). Moreover, given the availability of more resources, this gain may be even higher. This is a very important feature, because it allows us to numerically analyze systems of the order of \(10^6\) sites in a tolerable amount of time.


Figure 4: Number of jobs and CPU hours per month consumed for the simulations

 

References:

  1. D. Achlioptas, R.M. D’Souza and J. Spencer, Explosive Percolation in Random Networks, Science 323, 1453 (2009).
  2. R.A. da Costa, S.N. Dorogovtsev, A.V. Goltsev and J.F.F. Mendes, “Explosive Percolation” Transition is Actually Continuous, Physical Review Letters 105(25), 255701 (2010).
  3. P. Grassberger, C. Christensen, G. Bizhani, S.-W. Son and M. Paczuski, Explosive Percolation is Continuous, but with Unusual Finite Size Behavior, Physical Review Letters 106(22) (2011).
  4. O. Riordan and L. Warnke, Explosive percolation is continuous, Science 333 (2011).
  5. R.M. Ziff, Explosive growth in biased dynamic percolation on two-dimensional regular lattice networks, Physical Review Letters 103(4), 045701 (2009).
  6. F. Radicchi and S. Fortunato, Explosive Percolation: A numerical analysis, Physical Review E 81(3), 036110 (2010).
  7. N.A.M. Araújo and H.J. Herrmann, Explosive Percolation via Control of the Largest Cluster, Physical Review Letters 105(3), 035701 (2010).

First-principles studies on traditional and emerging materials

Scientists from the Computational Materials Science Group of the National Technical University of Athens (NTUA) have used the HellasGrid and EGI Grid infrastructures in order to solve problems from the area of computational materials science. Specifically, they used the Vienna Ab initio Simulation Package (VASP), a widely used code that performs so-called first-principles calculations on materials within the framework of density-functional theory (DFT).

The employment of traditional and emerging materials in technological applications presupposes a detailed knowledge of their physical properties. An accurate description of these properties at the atomic scale is linked to the solution of quantum-mechanical (QM) differential equations, a task of immense difficulty due to the interactions among electrons in extended systems. DFT codes such as VASP use one of the most popular theoretical approaches, density-functional theory, to solve the QM equations and thus describe the electronic, chemical, mechanical, optical, or transport properties of a plethora of physical systems (bulk solids, surfaces, nano-systems of different dimensionality, molecules, etc.).

Figure 1 - Energy variation during diffusion of a C dopant in TiO2

The scientists from the Computational Materials Science Group have undertaken a number of investigations [1-8] using the HellasGrid infrastructure to perform DFT calculations. These studies probed the properties of materials that have attracted strong interest in recent years. Examples include TiO2, a material with potential use in photovoltaics and photocatalysis [1], [2], organic semiconductors [3], [6], layered hard systems [4], [5], and materials employed in novel electronic devices [7], [8]. The associated publications [1-8] provide extensive information on how detailed knowledge of the QM properties of the above systems can lead to further optimization of related applications.

VASP is a parallel code that utilizes MPI to distribute the workload over different nodes. Since first-principles calculations are computationally intensive, the use of a large network of clusters such as HellasGrid is indispensable for this type of study. The number of nodes required varies from problem to problem, with small jobs typically running on 4-8 nodes and larger tasks on 32 or more nodes.

DFT calculations are currently underway to probe the properties of several other important materials, such as graphene and other two-dimensional materials, carbon nanotubes, silicon nanowires, organic photovoltaics, and others. The completion of these studies will lead to new publications and boost the scientists’ understanding of systems that play a key role in emerging technologies.

Contacts

  • Leonidas Tsetseris, Assistant Professor, NTUA, leont (at) mail.ntua.gr
  • Georgios Volonakis, PhD candidate, AUTH, gvolo (at) physics.auth.gr
  • Evangelos Golias, PhD candidate, NTUA, vgolias (at) ims.demokritos.gr

References

  1. “Stability and dynamics of carbon and nitrogen dopants in anatase TiO2”, L. Tsetseris, Physical Review B 81, 165205 (2010).
  2. “Configurations, electronic properties, and diffusion of carbon and nitrogen dopants in rutile TiO2”, L. Tsetseris, Physical Review B 84, 165201 (2011).
  3. “Stability of Group-V Endohedral Fullerenes”, L. Tsetseris, Journal of Physical Chemistry C 115, 3528 (2011).
  4. “Electronic and structural properties of TiB2: Bulk, surface, and nanoscale effects”, G. Volonakis, L. Tsetseris, and S. Logothetidis, Materials Science and Engineering B 176, 484 (2011).
  5. “Excess of boron in TiB2 superhard thin films: a combined experimental and ab initio study”, N. Kalfagiannis, G. Volonakis, L. Tsetseris, and S. Logothetidis, Journal of Physics D 44, 385402 (2011).
  6. “Impurity-related vibrational modes in a pentacene crystal”, G. Volonakis, L. Tsetseris, and S. Logothetidis, European Physical Journal: Applied Physics 55, 23903 (2011).
  7. “Ge volatilization products in high-k gate dielectrics”, E. Golias, L. Tsetseris, A. Dimoulas, and S. T. Pantelides, Microelectronic Engineering 88, 427 (2011).
  8. “Ge-related impurities in high-k oxides: Carrier traps and interaction with native defects”, E. Golias, L. Tsetseris, and A. Dimoulas, Microelectronic Engineering 88, 1432 (2011).

 

Classical and ab-initio Molecular Dynamics of molecular and/or ionic systems

Scientists from the Institute of Accelerating Systems & Applications (IASA) have used the HellasGrid Infrastructure and the EGI Grid in order to prepare the equilibrated systems to be used in the context of the PRACE European project (Work Package 7.4, “Benchmarking” framework).

In order to obtain a good estimate of the performance and scaling of these applications up to several thousand cores, well-equilibrated initial configurations are necessary. For each package a number of physical systems were prepared; the prepared configurations cover various simulation system sizes and methods, and each of them was equilibrated using the grid infrastructure. A minimal run of \(10^3\) steps for each Gromacs and NAMD case (needed in order to obtain reproducible performance and scaling figures) takes about 10-30 minutes using 1024-8192 cores on Tier-0 systems (170-4096 core-hours). A typical equilibration needs, depending on the system, more than \(10^5\) steps. For CP2K, much more time is necessary to obtain an equilibrated initial configuration (each step for the large cases takes ~2 hours using 2048 cores on the BG/P machine at Juelich).

In all the aforementioned cases, the equilibration runs were performed as a series of save/restart jobs on the HellasGrid infrastructure, using from 8 up to 32 cores for each job. These applications, and a few more (Quantum ESPRESSO, Towhee), were ported and optimized to run on the existing HellasGrid architectures. As far as Gromacs is concerned, it has an internal CPU detection mechanism, so the corresponding executable is internally optimized for different architectures. For NAMD and CP2K, different executables were produced taking into account the existing hardware. Currently, two major versions of the executables have been produced: one that runs on all HellasGrid sites, optimized for CPUs with SSE2 instructions, and one for CPUs with SSE4.1 instructions (HG-06). In all cases openmpi-1.4.3 was used as the parallel environment. For NAMD and CP2K, the free unsupported version of the Intel compilers was used, giving an additional ~2x performance boost. In order to avoid installing the Intel compiler suite on the HellasGrid clusters, static linking against the Intel libraries was used. Additional math libraries were required by these packages (fftw2, fftw3, gsl, lapack, Atlas, libint, Blacs, Scalapack); these libraries were compiled and installed on the UI (User Interface service) machine(s). Static linking was followed in every case, so installation on the WNs (Worker Nodes) was not necessary.

The HellasGrid Infrastructure provided the CPU resources needed to prepare the equilibrated initial configurations. These packages will soon be available on the HellasGrid clusters, for use on a per-Virtual-Organization basis (this is work in progress).

Contact details:

  • Marios Chatziangelou, IASA, mhaggel (at) iasa.gr
  • Dimitris Dellis IASA, ntell (at) iasa.gr
  • HellasGrid Application Support Team, IASA, application-support (at) hellasgrid.gr

Protein classification algorithms over a distributed computing environment

One of the most important challenges in modern Bioinformatics is the accurate prediction of the functional behavior of proteins. To this end, researchers from the Intelligent Systems and Software Engineering Lab (Dept. of Electrical and Computer Engineering) have been working successfully for several years on the design and implementation of novel data mining algorithms [1-3].

The strong correlation that exists between the properties of a protein and its motif sequence (Figure 1) makes the prediction of protein function possible. The core concept of any approach is to employ data mining techniques in order to construct models, based on data generated from already annotated protein sequences. A major issue in such approaches is the complexity of the problem in terms of data size and computational cost. However, the utilization of the HellasGrid Infrastructure and the EGI Grid, coupled with the close support of the Scientific Computing Center at A.U.Th., helped overcome the computational difficulties often encountered in protein classification problems.

Figure 1: [a] P00747 (Plasminogen precursor – PLMN_HUMAN) protein chain, and [b] an amino-acid pattern expressed as a regular expression
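As a small illustration of the kind of pattern shown in Figure 1[b] (the pattern and sequence below are made-up examples, not the actual PLMN_HUMAN motif), a PROSITE-style pattern can be translated into a regular expression and scanned against an amino-acid sequence:

```python
import re

def prosite_to_regex(pattern):
    # Simplified translation of PROSITE-style notation to a regular expression:
    #   x -> any residue, [..] -> allowed set, {..} -> excluded set,
    #   (n) or (n,m) -> repetition, '-' separates pattern elements.
    regex = pattern.replace("-", "")
    regex = regex.replace("x", ".").replace("X", ".")
    regex = regex.replace("{", "[^").replace("}", "]")
    regex = regex.replace("(", "{").replace(")", "}")
    return regex

# Hypothetical PROSITE-style pattern and sequence fragment
pattern = "C-x(2)-[DE]-x-{P}-C"
sequence = "MKTAYCGQDVACLLNN"

regex = prosite_to_regex(pattern)          # -> "C.{2}[DE].[^P]C"
match = re.search(regex, sequence)
print("regex:", regex)
print("match:", match.group(0) if match else None)
```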

 

G-Class was the first data-mining algorithm successfully ported to the EGI Grid infrastructure [4]. The G-Class methodology follows a “divide and conquer” approach comprising three steps (Figure 2).

Figure 2: First, protein data from PROSITE, an expert-based database, are divided into multiple disjoint sets, each one preserving the original data distribution. The new sets are used as training sets, and multiple models are derived by means of standard data mining algorithms. Finally, the models are combined to produce the final classification rules, which can be used to classify a given instance and evaluate the methodology.
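A generic sketch of this divide-and-conquer idea (using synthetic data, scikit-learn decision trees and majority voting as stand-ins; G-Class itself used standard data-mining algorithms and distributed the work over the Grid):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import StratifiedKFold, train_test_split

# Stand-in data: in G-Class the instances are motif-based protein feature
# vectors derived from PROSITE; here we generate a synthetic problem instead.
X, y = make_classification(n_samples=3000, n_features=40, n_classes=3,
                           n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y,
                                                    random_state=0)

# Step 1: divide the training data into disjoint, distribution-preserving splits
n_splits = 5
splitter = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
models = []
for _, idx in splitter.split(X_train, y_train):
    # Step 2: train one model per split (each split could run as its own job)
    model = DecisionTreeClassifier(random_state=0).fit(X_train[idx], y_train[idx])
    models.append(model)

# Step 3: combine the models by majority vote to classify new instances
votes = np.stack([m.predict(X_test) for m in models])
combined = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
accuracy = (combined == y_test).mean()
print(f"ensemble accuracy on the held-out set: {accuracy:.3f}")
```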

 

G-Class was a fairly simplistic approach to the protein classification problem, using generic data mining algorithms to construct several models simultaneously. However, the results were impressive, both in terms of the speed-up ratio (ranging from 10 to 60) and in terms of the amount of data that could be processed (ranging from 662 proteins over 27 different classes to 7027 proteins over 96 classes) (Figure 3).

Figure 3: The processing time in all cases follows the \(e^{-\alpha x}\) model, where \(\alpha\) depends on the size of the original dataset and \(x\) is the number of splits. The accuracy of the methodology is fairly constant over the number of splits, with minor fluctuations owing to the distribution of the instances of the overlapping protein classes over the different dataset splits.

 

A second approach aimed at the automatic annotation of protein sequences. Although there are many tools for protein annotation, such as the Gene Ontology project, ProDom, Pfam, and SCOP, in order to assign annotation terms to new, non-annotated protein sequences, these either have to be processed directly in a lab or characterized through similarity to already annotated sequences. At the moment, the amino-acid sequence of more than 1,000,000 proteins has been obtained; by contrast, the properties and functions of only 4% of these proteins are known. Therefore, the need for a systematic way to derive clues about the properties of a protein by inspecting its amino-acid sequence is obvious. PROTEAS is a novel parallel methodology for protein function prediction, which predicts the annotation of an unknown protein by running its motif sequence through each model and producing similarity scores [5-6]. The methodology has been implemented so that it can effectively utilize various classification schemata, such as Gene Ontology, SCOP families, etc. (Figure 4).

Figure 4: PROTEAS workflow diagram

 

The main drawback of this methodology is that it requires a substantial amount of computational time. It has been shown experimentally that the execution time needed to process the entire dataset on a single processor is prohibitively long. In order to address this issue, PROTEAS has been implemented both as a standalone and as a grid-based application. The grid-based application utilizes the MPI library for communication between distinct processes and uses the EGI Grid infrastructure in order to minimize execution times (Figure 5).

 

Figure 5: Execution times for model training

 

Moreover, the Grid provides for the seamless integration of the training process and the actual model evaluation, by allowing the concurrent retraining of Gene Ontology models from different input sources or experts alongside the use of the existing ones (Figure 6).

Figure 6: Execution times for specific Train/Test set ratio and different number of input files (left column), and for different ratios but specific number of input files (right column)

 

The application was executed on available clusters using from 4 to 16 processors in various experiment configurations (Figure 7). In all cases the accuracy of the results was very high and the overall execution time was satisfactory.

 

Figure 7: Total processing times for the classification of a single protein sequence, based on the number of CPUs used and the number of input files used as the model construction base.

 

Contact details:

  • Pericles A. Mitkas, Professor, AUTH, mitkas (at) eng.auth.gr
  • Fotis E. Psomopoulos, Research Associate, CERTH, fpsom (at) issel.ee.auth.gr
  • Scientific Computing Center, AUTH, contact (at) grid.auth.gr

 

References:

  1. Fotis E. Psomopoulos and Pericles A. Mitkas, “Bioinformatics Algorithm Development for Grid Environments”, Journal of Systems & Software, vol. 83, No 7. (2010), pp. 1249-1257.
  2. Fotis E. Psomopoulos and Pericles A. Mitkas, “Data Mining in Proteomics using Grid Computing”, Handbook of Research on Computational Grid Technologies for Life Sciences, Biomedicine and Healthcare, Editor: Mario Cannataro, Laboratory of Bioinformatics, University Magna Graecia of Catanzaro, 88100 Catanzaro, Italy, 2009, (chapter 13, pp. 245-267), UK: IGI Global.
  3. Fotis E. Psomopoulos and Pericles A. Mitkas: “Sizing Up: Bioinformatics in a Grid Context”, 3rd Conference of the Hellenic Society For Computational Biology and Bioinformatics – HSCBB ’08, 30-31 October 2008, Thessaloniki, Greece.
  4. Helen Polychroniadou, Fotis E. Psomopoulos and Pericles A. Mitkas: “g-Class: A Divide and Conquer Application for Grid Protein Classification”, Proceedings of the 2nd ADMKD 2006: Workshop on Data Mining and Knowledge Discovery (in conjunction with ADBIS’2006: The 10th East-European Conference on Advances in Databases and Information Systems), 3-7 September 2006, Thessaloniki, Greece, pp. 121-132.
  5. Christos N. Gkekas, Fotis E. Psomopoulos and Pericles A. Mitkas, “A parallel data mining application for Gene Ontology term prediction”, 3rd EGEE User Forum, Polydome Conference Centre, 11-14 February 2008, Clermont-Ferrand, France.
  6. Christos N. Gkekas, Fotis E. Psomopoulos and Pericles A. Mitkas, “A parallel data mining methodology for protein function prediction utilizing finite state automata”, presented at the 2nd Electrical and Computer Engineering Student Conference, April 2008, Athens, Greece.