Retention information is very difficult to harness for compound identification because retention times are influenced by so many experimental factors. This is especially true in gradient elution, where even the specific make and model of the HPLC instrument makes a big difference.
To obtain reproducible retention times, some researchers have built libraries of retention times and defined very rigid sets of experimental conditions that must be used to reproduce them. However, such a system is extremely limiting for the user because it leaves no room to optimize or change the HPLC method. Moreover, the requirement to use a specific make and model of HPLC instrument makes it inaccessible to many potential users, and it will inevitably become obsolete as instrumentation improves. More universal approaches, including de novo prediction and compendia of relative retention times and retention indices, have also failed to gain wide use because they are not accurate enough to be useful.
For example, consider HPLC retention indexing. In HPLC retention indexing, retention is reported not as a time, but as an index describing where a compound elutes between the nearest two bracketing standard compounds that were spiked into the sample (see figure below). The most common standard compounds are homologous series of alkylphenones, 2-ketoalkanes, or 1-nitroalkanes.
The retention index (RI) in gradient elution is calculated from:
where RI is the retention index and n is the number of carbons in the smaller bracketing standard compound.
The idea is that, since the bracketing standard compounds always experience nearly the same conditions as other compounds, system variables affecting their retention (e.g. temperature, column geometric factors, gradient profile, and flow rate) are largely cancelled out.
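As an illustration, here is a minimal Python sketch of a retention index calculation. It assumes the common linear-interpolation form RI = 100 × [n + (t_x − t_n) / (t_(n+1) − t_n)], where t_x is the retention time of the compound and t_n and t_(n+1) are the retention times of the bracketing standards with n and n + 1 carbons. The function name and the retention times in the example are hypothetical, not measured values.

```python
from bisect import bisect_right


def retention_index(t_x, standards):
    """Linear-interpolation retention index (assumed form, see lead-in).

    standards: list of (carbon_number, retention_time) for a homologous
    series in which consecutive members differ by one carbon.
    """
    standards = sorted(standards, key=lambda s: s[1])
    times = [t for _, t in standards]
    # Index of the last standard eluting at or before the compound.
    i = bisect_right(times, t_x) - 1
    if i < 0 or i >= len(standards) - 1:
        raise ValueError("compound does not elute between two standards")
    n, t_n = standards[i]
    _, t_next = standards[i + 1]
    return 100.0 * (n + (t_x - t_n) / (t_next - t_n))


# Hypothetical retention times (min) for standards with 8, 9, and 10 carbons.
standards = [(8, 4.0), (9, 6.0), (10, 8.5)]
ri = retention_index(5.0, standards)
print(ri)  # 850.0: halfway between the 8- and 9-carbon standards
```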
Unfortunately, HPLC retention is more complicated than that. In the figure on the right, isocratic retention vs. eluent composition relationships are plotted for three different compounds.
Retention indexing assumes that a compound will always elute at the same position between two bracketing compounds. But if you use amitriptyline and indole as bracketing compounds to predict the retention of acetophenone, the result depends heavily on solvent composition. At some eluent compositions, acetophenone doesn't even elute between amitriptyline and indole! It can be even worse than that: a previous report showed that upon changing the mobile phase from 20% to 50% methanol, the retention index of aspirin changed from 302 to 8. The problem with retention indexing is that it does not account for variability in the retention vs. solvent composition relationships of different compounds.
Still, the isocratic retention information shown in the previous figure can be easily measured, so another approach to predicting gradient retention is to calculate it (or "project" it) from the isocratic retention data. To project it, one can use the fundamental equation of gradient elution, plugging in the isocratic retention information for each compound along with the programmed gradient and flow rate profiles (see below).
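To make the projection concrete, here is a minimal Python sketch. It assumes the usual integral form of the fundamental equation of gradient elution, in which a compound elutes at t_R = t + t_0 once the integral of dt / (t_0 · k(φ(t))) from 0 to t reaches 1. Here t_0 is the dead time, φ(t) is the solvent composition at the column inlet, and k(φ) is the isocratic retention factor. The linear model ln k = ln k_w − S·φ, the function names, and all numerical values are assumptions for illustration only; measured isocratic data could be substituted for `lss_k`.

```python
import math


def linear_gradient(phi_start, phi_end, t_gradient, delay=0.0):
    """Return phi(t) for an ideal linear gradient arriving after `delay` min."""
    def phi(t):
        frac = min(max((t - delay) / t_gradient, 0.0), 1.0)
        return phi_start + (phi_end - phi_start) * frac
    return phi


def project_retention_time(k_of_phi, phi_of_t, t0, dt=0.001, t_max=120.0):
    """Integrate dt / (t0 * k) until it reaches 1, then add the dead time."""
    integral = 0.0
    t = 0.0
    while t < t_max:
        # Midpoint rule over the current time step.
        step = dt / (t0 * k_of_phi(phi_of_t(t + 0.5 * dt)))
        if integral + step >= 1.0:
            # Interpolate linearly inside the final step.
            return t + dt * (1.0 - integral) / step + t0
        integral += step
        t += dt
    raise ValueError("compound did not elute within t_max")


def lss_k(phi, ln_kw=6.0, S=10.0):
    """Hypothetical isocratic retention model: ln k = ln_kw - S * phi."""
    return math.exp(ln_kw - S * phi)


# Programmed gradient: 5% to 95% B in 20 min, dead time of 1 min (assumed).
phi_of_t = linear_gradient(0.05, 0.95, t_gradient=20.0)
t_r = project_retention_time(lss_k, phi_of_t, t0=1.0)
print(f"projected retention time: {t_r:.2f} min")
```

Because the gradient enters only through `phi_of_t`, a measured or distorted profile can be passed in place of the ideal programmed one without changing the integration.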
Unfortunately, this isn't very accurate (>10% error). That's mainly because the actual gradient and flow rate profiles produced by HPLC instruments are non-ideal. They usually differ considerably from the expected (programmed) profiles. On the right, you can see two gradient profiles measured from two different HPLC instruments, both programmed to produce the same 5 min gradient. Such gradient and flow rate distortions must be taken into account to accurately project retention.
By directly measuring the gradient profiles and accounting for them in our calculations of gradient retention, we were able to project retention with 1-3% accuracy (as a percentage of the gradient time). This is certainly high enough accuracy to be very useful as supplementary information for compound identification, but it has a serious flaw: very precise measurement of the gradient profile is a meticulous process. Most people would not be interested in going to all that effort.
Back-calculation of gradient and flow rate profiles
Instead, we discovered a much simpler, more precise way to account for the true shapes of the gradient and flow rate profiles produced by different HPLC instruments.
First, we spike our sample with 15 compounds we call "instrument calibration solutes". There is nothing special about the instrument calibration solutes except that a) they elute over a wide range of the gradient and b) we have previously measured their isocratic retention vs. solvent composition relationships. Then we run the sample and record the retention times of the instrument calibration solutes. Using Retention Predictor, we then back-calculate what the effective gradient and flow rate profiles must have been to give those retention times.
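The idea of back-calculation can be sketched in a few lines of Python. This is a heavily simplified illustration, not the algorithm used in Retention Predictor: it assumes the only instrument distortion is a single gradient delay, and it finds the delay that best reproduces the retention times of the calibration solutes. The retention model (ln k = ln k_w − S·φ), the solute parameters, and the "observed" times (generated synthetically from a known delay) are all hypothetical.

```python
import math


def project(ln_kw, S, delay, t0=1.0, phi0=0.05, phi1=0.95, t_g=20.0, dt=0.005):
    """Projected retention time (min) for a linear gradient delayed by `delay`."""
    integral, t = 0.0, 0.0
    while t < 200.0:
        frac = min(max((t + 0.5 * dt - delay) / t_g, 0.0), 1.0)
        k = math.exp(ln_kw - S * (phi0 + (phi1 - phi0) * frac))
        step = dt / (t0 * k)
        if integral + step >= 1.0:
            return t + dt * (1.0 - integral) / step + t0
        integral += step
        t += dt
    raise ValueError("solute did not elute")


def back_calculate_delay(solutes, observed, lo=0.0, hi=5.0, tol=1e-3):
    """Golden-section search for the delay minimizing squared residuals."""
    def sse(delay):
        return sum((project(ln_kw, S, delay) - t_obs) ** 2
                   for (ln_kw, S), t_obs in zip(solutes, observed))

    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if sse(c) < sse(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


# Hypothetical calibration solutes as (ln_kw, S) pairs.
solutes = [(4.0, 8.0), (6.0, 10.0), (8.0, 11.0)]
# Synthetic "observed" retention times generated with a known delay.
true_delay = 1.3
observed = [project(ln_kw, S, true_delay) for ln_kw, S in solutes]

estimated = back_calculate_delay(solutes, observed)
print(f"back-calculated delay: {estimated:.2f} min")
```

A realistic back-calculation would fit the full shapes of the gradient and flow rate profiles rather than one parameter, but the principle is the same: adjust the profile until the projected retention times of the calibration solutes match the measured ones.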
It turns out this method of measuring gradient and flow rate profiles is extremely precise. When we then use the back-calculated profiles to project the retention of other compounds (whose retention vs. solvent composition relationships we also know - see the retention database), the projected retention times are extremely accurate. The table below shows the experimental and projected retention times of 20 compounds in a 20 min gradient at 200 μL/min. The standard deviation among all compounds was only ±3.3 s!
Not only is this method of predicting retention extremely accurate, but its accuracy does not deteriorate when the gradient, the flow rate, or the HPLC instrument is changed. The tables below show the accuracy of our retention projections at different gradients, flow rates, and on two different HPLC instruments. In each case, the accuracy of these retention projections is unprecedented. In fact, the accuracy is nearly equal to the minimum error you could possibly expect based on the reproducibility of the isocratic measurements. This indicates that the back-calculated profiles account for virtually all instrument-related factors controlling retention.
How to use it
By using Retention Predictor to predict HPLC retention times, you have the flexibility to use virtually any gradient, flow rate, HPLC instrument, and column dimensions. However, three factors must be held constant:
1) Solvent A must be 0.1% formic acid in water and solvent B must be pure acetonitrile
2) The column temperature must be set to 35 °C
3) The stationary phase must be Waters Acquity BEH C18 (1.7 μm, 130 Å)
In the future, we will measure isocratic retention data for each compound on three different stationary phases, allowing you to choose between three phases.
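The three fixed conditions listed above can be captured as a simple pre-run check. The following Python sketch is only an illustration: the dictionary keys, the function name, and the 0.5 °C temperature tolerance are assumptions, not part of Retention Predictor.

```python
# Required conditions taken from the list above; key names are hypothetical.
REQUIRED = {
    "solvent_a": "0.1% formic acid in water",
    "solvent_b": "acetonitrile",
    "column_temperature_c": 35.0,
    "stationary_phase": "Waters Acquity BEH C18",
}


def check_method(method):
    """Return a list of problems; the list is empty if the method complies."""
    problems = []
    for key, required in REQUIRED.items():
        actual = method.get(key)
        if isinstance(required, float):
            # Assumed tolerance of 0.5 degrees C on the temperature setpoint.
            ok = isinstance(actual, (int, float)) and abs(actual - required) < 0.5
        else:
            ok = isinstance(actual, str) and actual.strip().lower() == required.lower()
        if not ok:
            problems.append(f"{key}: expected {required!r}, got {actual!r}")
    return problems


method = {
    "solvent_a": "0.1% formic acid in water",
    "solvent_b": "methanol",  # deliberately wrong for the example
    "column_temperature_c": 35,
    "stationary_phase": "Waters Acquity BEH C18",
}
for problem in check_method(method):
    print(problem)
```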
To use Retention Predictor, you would perform the following four steps (note that this is a general procedure - we will provide a more detailed protocol in the future):
Step #1: Spike your sample with the instrument calibration solutes. In our previous work, we used (1) adenosine, (2) N,N-dimethylacetamide, (3) p-toluenesulfonic acid, (4) N,N-diethylacetamide, (5) indole-3-acetic acid, (6) dimethyl phthalate, (7) indole, (8) diethyl phthalate, (9) diallyl phthalate, (10) di-n-propyl phthalate, (11) di-n-butyl phthalate, (12) di-n-pentyl phthalate, (13) di-n-hexyl phthalate, (14) di-n-heptyl phthalate, and (15) di-n-octyl phthalate. We added the instrument calibration solutes to a final concentration of 100 μM each.
Step #2: Run the spiked sample. Solvent A must be 0.1% formic acid in water and solvent B must be pure acetonitrile. The column temperature must be set to 35 °C. The stationary phase must be Waters Acquity BEH C18 (1.7 μm, 130 Å).
Step #3: Enter the retention times of the instrument calibration solutes into Retention Predictor. Click the "Next Step" button and then the "Back-Calculate Profiles" button to back-calculate the effective gradient and flow rate profiles.
Step #4: Click the "Next Step" button and then the "Predict Retention Times" button to calculate the retention times of all the other compounds in the database.
Unfortunately, Retention Predictor isn't yet ready for practical use because the retention database contains only 35 compounds. In the near future, we plan to begin building a very large database that would allow accurate retention predictions for many thousands of compounds.