EuResist - Integration of viral genomics with clinical data to predict response to anti-HIV treatment
- EC contribution: € 2.143.000
- Duration: 30 months
- Starting date: 01/01/2006
- Funding scheme: Specific Targeted Research Project
- Keywords: biomedical informatics, decision support systems, HIV, drug resistance, artificial intelligence
- Contract/Grant agreement number: IST-2004-027173
- Project web-site: http://www.euresist.org
Background:
The EuResist project has developed a European integrated system for the clinical management of antiretroviral drug resistance. The system will provide clinicians with an on-line prediction of response to antiretroviral treatment in HIV patients, thus helping them to choose the best drug combinations for any given HIV genetic variant. To this end, a large European integrated data set has been created by merging resistance databases from Italy (ARCA), Germany (AREVIR) and Sweden (Karolinska). The EuResist integrated database is probably the largest available source of information on the clinical implications of resistance to antiretrovirals.
The EuResist Combined Prediction System is formed by an array of three predictors, each based on its own definition of features and "derived features", used to extract the maximum information from the data set:
- Generative-Discriminative.
- Evolutionary.
- Higher order interaction.
The Combined Prediction System has been validated through a comparative study with the reference rule-based algorithms most commonly used for HIV genotype interpretation. The outcome shows that the EuResist Combined Prediction System performs better than the current state of the art.
Also, an Engine-Versus-Experts (EVE) evaluation study is currently ongoing with encouraging preliminary results.
Problem:
While combination antiretroviral therapy has made HIV infection a treatable condition, eradication of infection is not yet achievable and antiretroviral therapy needs to be administered as a prolonged, possibly lifelong treatment. Long-term toxicity, difficulty in adhering to complex regimens, possible pharmacokinetic problems, and intrinsically limited potency are all factors favouring the selection of drug-resistant viral strains. Development of drug resistance is now a major cause of treatment failure.
Aim:
The EuResist approach uses viral genotype data integrated with treatment response data from clinical practice to predict the efficacy of different antiretroviral regimens in the presence of a given HIV genotype and additional parameters. This strategy bypasses the original genotype-phenotype correlation step and points directly to the drug combinations expected to be most effective in clinical practice. In line with this approach, the EuResist objectives were:
- To integrate biomedical information from three large and expanding databases in different European countries collecting the required critical mass of historical and prospective data;
- To define the Standard Datum to be extracted from the database to maximize the useful information, including HIV genotype and additional clinical information;
- To develop and validate several models for prediction of the response to treatment based on the defined Standard Datum and on additional features;
- To combine the best performing models into the final EuResist Combined Predictive System and to make it available on the web for optimization of antiretroviral treatment.
Expected and obtained results:
The EuResist Integrated DataBase has been realised by physically merging the founding ARCA, AREVIR and Karolinska databases (fig. 1). Thanks to expanding collaborations, new cohorts are joining the database, such as the one from Luxembourg. At present, the EuResist database holds 18,467 patients, 64,864 therapies, 22,006 polymerase sequences and 240,795 viral load measurements, making it the largest existing HIV resistance database.

Fig. 1: The EuResist Integrated DataBase
A Standard Datum has been defined in compliance with the definition proposed by the Forum for Collaborative HIV Research (www.hivforum.org), extended with additional features. Each instance consists of viral genotypes and clinical markers around a Treatment Change Episode (TCE), plus additional features such as clinical and demographic data. Follow-up variables are considered from eight to twelve weeks after the TCE, with the aim of predicting short-term therapy outcome. Virological success is defined as the achievement of fewer than 500 copies of HIV RNA or a decrease of at least 2 log from the baseline value.
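The success criterion above can be written as a small labelling function. This is only a minimal sketch: the function name, the parameter names and the treatment of a zero follow-up measurement are assumptions made for illustration, not part of the EuResist definition.

```python
def virological_success(baseline_copies, followup_copies,
                        threshold=500.0, min_log_drop=2.0):
    """Label a follow-up viral load as success or failure.

    Success means either a follow-up load below `threshold` copies,
    or a drop of at least `min_log_drop` log10 units from baseline.
    All names here are hypothetical, chosen for this sketch.
    """
    if baseline_copies <= 0 or followup_copies < 0:
        raise ValueError("viral loads must be positive (baseline) "
                         "and non-negative (follow-up)")
    # Criterion 1: follow-up load below the absolute threshold.
    if followup_copies < threshold:
        return True
    # Criterion 2: a drop of N log10 units equals a ratio of 10**N.
    return baseline_copies / followup_copies >= 10 ** min_log_drop
```

For example, a fall from 1,000,000 to 5,000 copies counts as success (a 200-fold drop), whereas a fall from 100,000 to 5,000 copies does not (a 20-fold drop with the follow-up load still above 500).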
Different predictive methods have been developed and compared. In addition to the variables in the Standard Datum, a number of derived features have been considered. Three engines, each using different derived features, have been chosen as the best performing ones:
- Generative-Discriminative (GD). The model uses a Bayesian network to derive the probability of therapy success from clinical marker measurements only (without genotype), thus exploiting information contained in the database that would otherwise be discarded under the definition of the Standard Datum. Together with other features, this probability is used for classification by Logistic Regression.
- Evolutionary (EV). A model of viral evolution under the selective pressure exerted by a specific drug makes it possible to derive a probability measure of the likelihood that the virus remains susceptible to that drug: the so-called "Genetic Barrier" to resistance. Together with other features, the Genetic Barrier is used as input for a Logistic Regression classification model.
- Higher order interaction, or Mixed Effects (ME). A number of second and third order interactions among variables (drug x drug, drug x drug x drug, drug x mutation, ...) are considered, together with other features, as inputs for a Logistic Regression learning method, thus accounting for the effect of modern multi-drug regimens.
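To make the idea of higher order interaction features concrete, the sketch below builds binary indicator features for one regimen, including the drug x drug, drug x drug x drug and drug x mutation terms named above. The function name, the feature naming scheme and the example drug and mutation codes are assumptions for illustration only; the source does not describe how the ME engine encodes its features.

```python
from itertools import combinations

def interaction_features(drugs, mutations):
    """Return a dict of binary indicator features for one regimen.

    Includes main effects plus second and third order drug
    interactions and drug x mutation interactions. The naming
    scheme is hypothetical.
    """
    drugs = sorted(set(drugs))
    mutations = sorted(set(mutations))
    # Main effects: one indicator per drug and per mutation.
    features = {f"drug:{d}": 1 for d in drugs}
    features.update({f"mut:{m}": 1 for m in mutations})
    # drug x drug and drug x drug x drug interactions.
    for order in (2, 3):
        for combo in combinations(drugs, order):
            features["x".join(f"drug:{d}" for d in combo)] = 1
    # drug x mutation interactions.
    for d in drugs:
        for m in mutations:
            features[f"drug:{d}xmut:{m}"] = 1
    return features

# Illustrative input: a three-drug regimen and one mutation.
example = interaction_features(["AZT", "3TC", "EFV"], ["M184V"])
```

A feature dictionary of this kind could then be vectorised and passed to any logistic regression implementation; that step is left out here because the source does not specify the tooling.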
Several combination methods of the three selected engines have been studied to realise the final EuResist Combined Predictive System. A linear combination of these algorithms has been chosen as the best performing and most understandable method.
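A linear combination of engine outputs can be sketched as a weighted sum of the three predicted success probabilities. The weights below are placeholders: the source does not report the coefficients actually used, so equal weighting is an assumption of this example, as are the function and parameter names.

```python
def combine_predictions(probabilities, weights=None):
    """Linearly combine per-engine success probabilities.

    `probabilities` holds one value per engine (e.g. GD, EV, ME).
    If `weights` is omitted, equal weights are assumed; this is a
    placeholder, not the weighting used by EuResist.
    """
    if not probabilities:
        raise ValueError("at least one engine prediction is required")
    if weights is None:
        weights = [1.0 / len(probabilities)] * len(probabilities)
    if len(weights) != len(probabilities):
        raise ValueError("one weight per engine is required")
    return sum(w * p for w, p in zip(weights, probabilities))

# Hypothetical outputs of the three engines for one regimen.
combined = combine_predictions([0.6, 0.8, 0.7])
```

Part of the appeal of a linear scheme is that each engine's contribution to the final score can be read directly from its weight, which fits the stated preference for an understandable method.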
The performance of the three methods, separately and combined, has been evaluated on a test set (10% of the data, kept aside for this purpose) and compared with the state-of-the-art Stanford HIVdb. The results (fig. 2) show that the EuResist Combined Prediction System performs better than the state of the art.
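The hold-out evaluation described above can be illustrated as follows. This sketch assumes a random split and uses plain classification accuracy as the score; the source states neither how the 10% was selected nor which metric was reported, so both choices, and all names, are assumptions.

```python
import random

def holdout_split(instances, test_fraction=0.10, seed=0):
    """Shuffle the instances and keep `test_fraction` aside for testing.

    Returns (training_set, test_set). A random split is assumed here.
    """
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(predicted, actual):
    """Fraction of instances whose predicted label matches the outcome."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Placeholder instance identifiers standing in for Standard Datum records.
train, test = holdout_split(range(100))
```

The engines would be fitted on the training part only and scored once on the held-out part, so that the comparison with a reference system is made on data none of the models has seen.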
In addition, an Engine-Versus-Experts (EVE) evaluation study is currently ongoing. First results show that the engine's performance is comparable with that of internationally recognised human experts.

Fig. 2: The EuResist best performing individual engines and Combined Prediction System compared with Stanford HIVdb for classification of treatment outcome in the validation data set.
Finally, the EuResist web interface has been developed under the guidance of virologists and physicians. Free web access to the EuResist Combined Prediction System will be provided to research and medical communities for clinical management of antiretroviral drug resistance.
In order to deploy the EuResist project results and carry on any related activity, a European Economic Interest Grouping has been formed: the EuResist Network GEIE, comprising Informa s.r.l. (administrator), Karolinska Institutet, Max Planck Institut and Università di Siena.
Expected advantages of the EuResist system include not only more effective care for patients but also a lower cost of therapy, both by avoiding improper use of antiretroviral drugs and by making it possible to stop prophylaxis of opportunistic infections as the patient's immune status improves.
Potential applications:
The project can also be considered a pilot for HCV and HBV, since large-scale antiviral treatment has been started for both and the chronic nature of these infections is expected to lead to the development of drug resistance.
Coordinator:
Informa Srl
Via dei Magazzini Generali 13
00184 Rome, Italy
Tel. +39 06 575 8926
Fax +39 06 575 9937
E-mail: f.incardona@informacro.info
Scientific Coordinator:
Università degli Studi di Siena
Department of Molecular Biology - Section of Virology
V lotto, piano 2 Policlinico Le Scotte
Viale Bracci
53100 Siena, Italy
Tel: +39 0577 233850
Fax: +39 0577 233870
E-mail: zazzi@unisi.it
Partners:
Karolinska Institutet
Karolinska University Hospital
Stockholm, Sweden
Dr Rolf Kaiser
Universitaetsklinikum Köln
Institute of Virology
Cologne, Germany
Dr Shai Fine
IBM Israel - Science and Technology Ltd
Haifa, Israel
Prof. Thomas Lengauer
Max-Planck-Gesellschaft zur Foerderung der Wissenschaften e.V.
Computational Biology and Applied Algorithmics
Saarbrücken, Germany
Dr Fulop Bazso
MTA KFKI Részecske- és Magfizikai Kutatóintézet
Department of Biophysics
Budapest, Hungary
Dr Andrea Petroczi
Kingston University
Kingston upon Thames,
England, UK


