Drug Discovery & Development

Reed Business Information
Rockaway, NJ, 07866



Size and Quality Drive Compound Library Creation

Interest in small, focused libraries grows, even as the need for large compound libraries continues, but a common goal is discovery of active, drug-like molecules


Angelo DePalma, PhD
DePalma is a writer based in Newton, N.J.


Inologic's focused chemistry strategy creates small-compound libraries based on the chemical scaffolds of inositol signaling molecules, one of which is shown here. Through this approach, Inologic identified INO-4995, a cystic fibrosis drug that rebalances ion movement in lung tissue. (Source: Inologic)

Organizations may differ in library acquisition methods (e.g. synthesis versus purchase), compound inclusion criteria, synthesis methodology, or how they mine their collections to uncover leads. The rationale, however, remains constant: to find new, patentable structures as efficiently as possible.

Automated synthesis and high-throughput screening, adopted with a vengeance during the 1990s, enabled companies to synthesize, test, and maintain compound libraries populated with hundreds of thousands, even millions, of unique compounds. Although automation opened the door to new chemistry and gargantuan numbers of compounds, companies rapidly realized they had become slaves, rather than masters, of their multimillion-dollar automation investments. "Companies limited their potential for creative discovery because of the millions they invested in high-tech discovery and screening," says Edward Field, CEO of Inologic, Seattle.

Diversity not enough
A typical large pharmaceutical company possesses between 1 and 10 million compounds in its various compound collections. Field estimates that industry-wide, as many as one billion molecules may be on file. These are incredible numbers, especially when compared with compound collections of 40 years ago. However, even one billion structures pale in comparison with the number of possible chemical structures, estimated at about 10^64, or the much smaller number of possible organic molecules, in the neighborhood of 10^14. Even if only one in one thousand organic molecules is deemed drug-like, it's safe to say that chemists have a long way to go before they run out of structures.
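A back-of-the-envelope check of that claim, as a minimal sketch. It reads the quoted figure for possible organic molecules as 10^14 and uses the hypothetical one-in-one-thousand drug-like fraction from the paragraph above; the variable names are illustrative only.

```python
# Orders of magnitude quoted in the article; none of these are measured values.
on_file = 10**9                  # Field's estimate of molecules on file industry-wide
organic_molecules = 10**14       # quoted figure for possible organic molecules
drug_like_one_in = 1000          # the article's hypothetical "one in one thousand"

# Number of drug-like organic molecules under that hypothetical fraction.
drug_like = organic_molecules // drug_like_one_in

# Share of the drug-like space that one billion compounds could cover at most.
coverage = on_file / drug_like

print(f"Drug-like molecules: {drug_like:.0e}")
print(f"Maximum coverage by compounds on file: {coverage:.0%}")
```

Under these assumptions, even if every compound on file were drug-like and unique, the industry's collections would cover about 1% of the drug-like organic space.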

Compound collections are all about chemical diversity, but random synthesis towards that end "is nothing more than a numbers game," says Pravin Chaturvedi, PhD, CEO of Scion Pharmaceuticals, Medford, Mass. Scion specializes in agents that work on ion channels, which are notoriously difficult targets that could perhaps benefit from the "shotgun" combinatorial approach. Yet, Chaturvedi is sold on smaller, more targeted libraries.

Drug discovery's reliance on large libraries, beginning in the early 1990s, was appropriate at the time but has outlived its usefulness, says Chaturvedi. "By the late 1990s, large libraries turned into uncontrollable beasts. We had reached the point of diminishing returns."

This raises the question: Aren't 1 billion compounds always better than 10 million or 1 million? The answer is yes, but as Chaturvedi points out, 1 billion compounds cost a thousand times as much to synthesize and screen as 1 million, which cost 10 times as much as 100,000. At some point, organizations must decide when enough is enough. "No investor will back a company that proposed acquiring one billion compounds. You'd have better chances of winning the lottery."

Avalon Pharmaceuticals, Germantown, Md., firmly believes in chemical diversity. The company's compound libraries, at 100,000 compounds smallish by industry standards, are described by CEO Kenneth Carter, PhD, as focused, yet diverse. "Our compounds occupy maximal chemical space, with very few overlaps in either structure or chemical backbone."


Overlaid chromatograms from 24 analyses performed in parallel using Nanostream's Veloce system. Increasing sample analysis capacity facilitates redundant analyses, which result in more statistically meaningful data. By enabling routine compound library QA, the Veloce system allows scientists to make critical decisions regarding reliability of hits early in drug discovery. (Source: Nanostream Inc.)

Avalon's screening compound collection is acquired primarily from outside vendors. Internal synthesis efforts focus on hit-to-lead transition and lead optimization, but sometimes include limited library development. Avalon uses solution-phase chemistry almost exclusively, believing, as many medicinal chemists do, that solid-phase synthesis provides high throughput but at a cost to chemical diversity. The reason, says Carter, is that generating large numbers of molecules at high purity demands building blocks with similar reactivity, which almost by definition results in poor diversity. Another drawback he cites is long development time.

Still, Avalon recognizes the importance of striking a balance between numbers and quality. "We have addressed this issue by using a two-stage process," says Carter. In the first step, Avalon eliminates compounds that contain reactive or unstable groups or that have undesirable molecular properties such as a high logP. Second, based on its "target-agnostic" screening approach, the company selects for maximum structural diversity, simultaneously limiting analogs for a specific chemical scaffold.
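The two-stage idea can be illustrated with a short sketch. This is not Avalon's actual procedure: the logP cutoff, the per-scaffold cap, the `Compound` fields, and the sample data are all assumptions made for illustration, and stage two is simplified to a cap on analogs per scaffold rather than a true structural-diversity selection.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Compound:
    name: str
    scaffold: str             # hypothetical scaffold label
    logp: float               # hypothetical precomputed logP value
    has_reactive_group: bool  # hypothetical flag for reactive/unstable groups

MAX_LOGP = 5.0         # assumed cutoff; the article says only "a high logP"
MAX_PER_SCAFFOLD = 2   # assumed cap on analogs per scaffold

def stage_one(compounds):
    """Drop compounds with reactive groups or a logP above the cutoff."""
    return [c for c in compounds
            if not c.has_reactive_group and c.logp <= MAX_LOGP]

def stage_two(compounds):
    """Keep at most MAX_PER_SCAFFOLD compounds for each scaffold."""
    counts = defaultdict(int)
    kept = []
    for c in compounds:
        if counts[c.scaffold] < MAX_PER_SCAFFOLD:
            kept.append(c)
            counts[c.scaffold] += 1
    return kept

candidates = [
    Compound("A1", "indole", 2.1, False),
    Compound("A2", "indole", 3.0, False),
    Compound("A3", "indole", 2.7, False),     # third indole analog, over the cap
    Compound("B1", "quinoline", 6.2, False),  # logP above the cutoff
    Compound("B2", "quinoline", 3.9, False),
    Compound("C1", "pyrazole", 1.5, True),    # reactive group
]

selected = stage_two(stage_one(candidates))
print([c.name for c in selected])  # ['A1', 'A2', 'B2']
```

A real diversity selection would compare structures directly, for example with fingerprint similarity, rather than rely on a scaffold label.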

Avalon's strategy short-circuits the traditional two- to three-year target validation steps, essentially allowing identification of primary leads directly from genomics data, even with incomplete knowledge of the actual protein target. Eventually, its scientists may characterize targets, but this activity is not part of the critical path in discovery.

Avalon's approach to library design was validated by identification of compounds active against colon and breast cancers. These molecules have hit two big discovery goals: unique structures and mechanisms of action.

Separate, equal
As combinatorial chemistry became more automated and specialized, companies created separate groups to handle synthesis, library acquisition, and compound management. Aventis, for example, maintains a 60-person combinatorial chemistry group in Tucson, Ariz., that feeds libraries of various sizes to the rest of the company. (For details, see the cover story in the July issue of Drug Discovery & Development.)

Other firms take a more traditional view, preferring not to segregate discovery-related chemical competencies. "Med-chem and parallel synthesis sources are not differentiated at Roche," says Michael Dillon, PhD, senior research scientist at Roche's research site in Palo Alto, Calif. "All med-chem groups throughout the organization have library synthesis capabilities." Like most large discovery organizations, Roche constructs libraries to target gene families or protein targets, or to expand the chemical diversity of the company compound collection. Roche's lead-generation strategy includes acquiring libraries from specialist compound suppliers, contracting with external partners for library synthesis, and developing libraries in-house.

Roche's view on library size is squarely in line with current thinking. "Industry has moved away from very large compound libraries," says Dillon. "The mid-to-late 1990s saw an explosion in compound numbers, but most companies today recognize the value of manageable, well-designed compound collections."

Roche uses all modern synthetic tools, including parallel synthesis, combinatorial chemistry, and solid-phase methods, the latter principally (although not exclusively) in the form of solid-phase reagents and scavengers. Resin-bound synthesis is also used, but only where the technique offers an advantage, such as when very large numbers of related structures are desired.

"Solid-phase synthesis usually entails longer development times than solution phase chemistry," says Dillon.

Under today's discovery paradigm, the majority position is to focus more on activity than on numbers of compounds. "Within a high-throughput screening operation, you find a protein target, screen a million compounds, and see if you get the activity you're looking for," observes Inologic's Field. "However, therapeutic groups responsible for filing INDs care only about activity, not a compound's origin. They'll ask, 'How many active compounds can you deliver?' "

Inologic, which develops drugs that interact with disease-related inositol-signaling pathways, falls squarely into the "small, crafted" library camp, through an approach that relies on both chemical/structural and activity-based library building. By focusing on drug targets for which an ideal "drug" already exists (inositol), Inologic can fill the gaps in chemical space relatively straightforwardly. For example, the company discovered its (preclinical) cystic fibrosis drug by screening "maybe 20 to 25 compounds," says Field. "But the ones we screened showed an activity profile like nothing else we've seen. This success illustrates the power of our focused strategy."

Tools of the Trade
Many large companies do not purify individual compounds in 100,000-entry libraries, instead preferring to bias reaction pathways toward products that are reasonably pure. Eventually, when discovery chemists settle on focused panels of between 200 and 1000 compounds, they rely on a number of tools to help clean up these smallish libraries.

One method is solid-phase scavenger reagents, which can soak up substantial amounts of impurities and side products, and even drive reactions towards completion by shifting equilibria. Simple anion or cation exchange resins are popular for removing anionic or cationic species, respectively. Other resins are specific for reagents such as amines, hydrazines, or carboxylates. Sigma-Aldrich, Glycopep, Argonaut Technologies, and Polymer Laboratories, among others, offer scavenger resins suitable for end products of both solid- and solution-phase reactions.

Scavengers work well to remove major impurities from individual reactions, but top-tier libraries in the 200- to 1000-compound range probably require more careful cleaning up. According to separations/microfluidics specialist Nanostream Inc., Pasadena, Calif., analytical methods and quality control are themes its pharmaceutical customers return to time and again when they discuss needs for handling large compound collections. Nanostream's first product, Veloce, analyzes and purifies up to 24 compounds simultaneously, through an automated, microfluidic, reusable format.

Veloce offers the equivalent of 24 microcapillary reverse-phase (C18) columns with flow rates of just 10 to 15 mL per minute, so solvent waste and disposal are minimized. In one case study, Veloce's purity assessments of compounds on ten 96-well plates were within 2% of the purity determined by HPLC.

Veloce is not cheap. Marketing manager Surekha Vajjhala hinted at a $250,000 price tag, which is significantly higher than for a single HPLC. However, the instrument essentially replaces 24 HPLCs, and takes up much less space.

Numbers game still played
Today, the pendulum has swung back toward rational design, sort of. Although chemistry directors generally prefer smaller, crafted, and more focused libraries, the "numbers game" still has its allure, especially when combined with rational design and computational methods.

Chemical Diversity Labs (CDL) Inc., San Diego, a chemistry services provider which generates more than 180,000 compounds per year for its pharmaceutical industry clients, nevertheless puts a premium on library quality versus large numbers.

CDL synthesizes as many as 800 "probe" libraries per year, each containing about 100 to 300 compounds. The libraries are targeted to GPCRs, kinases, phosphatases, nuclear receptors, proteases, and ion channels. Nikolay Savchuk, vice president for business development and director of information technologies at CDL, notes that novel compounds are required to fill in the "depth and breadth of chemistry space." Through an approach CDL calls "bio-isosteric transformation," CDL identifies compounds that differ in structure but behave the same biologically. "We've validated this approach throughout several generations of our libraries," says Savchuk, "and developed specific tools in our ChemoSoft software platform to generate broad novel chemistry space."

CDL approaches library design and synthesis in several steps. First, its medicinal chemists work with the knowledge datasets of ChemoSoft and external resources, surveying the scientific and patent literature for promising protein targets and potential therapeutic applications. Second, they apply computational chemistry to establish the space of target-relevant privileged structures and propose new scaffolds and corresponding libraries by expanding the diversity and novelty from known and available chemistry of an initial compound set.

"Chemical diversity results from either scaffold or side chain diversity," says Savchuk. While side chains interact most directly with a target, scaffolds help orient those side chains and thus influence ligand potency and specificity. In this sense, a library consisting of all the permutations of 20 scaffolds and 20 side chains is more diverse than one consisting of the permutations of 10 scaffolds and 40 side chains. CDL tries to maximize template diversity while synthesizing fewer compounds around each template. The set of maximally diverse chemotypes is then subjected to in silico screening (docking, neural networks) to establish the target platform-specific reagent and product space, and is tested for ADME-related properties (CYP, absorption, solubility, and stability).
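The comparison of 20 scaffolds with 20 side chains against 10 scaffolds with 40 side chains is easy to verify by enumeration. The sketch below assumes, for simplicity, that each scaffold carries a single attachment point, so a compound is one scaffold paired with one side chain; the labels and the `enumerate_library` helper are hypothetical.

```python
from itertools import product

def enumerate_library(n_scaffolds, n_side_chains):
    """List every (scaffold, side chain) pair for one attachment point."""
    scaffolds = [f"S{i}" for i in range(1, n_scaffolds + 1)]
    side_chains = [f"R{j}" for j in range(1, n_side_chains + 1)]
    return list(product(scaffolds, side_chains))

lib_a = enumerate_library(20, 20)
lib_b = enumerate_library(10, 40)

for label, lib in (("20 x 20", lib_a), ("10 x 40", lib_b)):
    n_scaffolds = len({scaffold for scaffold, _ in lib})
    per_scaffold = len(lib) // n_scaffolds
    print(f"{label}: {len(lib)} compounds, {n_scaffolds} scaffolds, "
          f"{per_scaffold} compounds per scaffold")
```

Both libraries contain 400 compounds, so the synthesis effort is comparable. The first spreads those compounds over twice as many templates, with half as many analogs around each one.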

Combination of approaches
Large libraries, despite their waning appeal, are still in demand from specialty synthesis vendors. "Companies still ask us for large numbers of compounds," says William Early, PhD, assistant director for combinatorial chemistry at Albany Molecular Research, Albany, N.Y.

Early says that the hit rate can be improved, from about 4% to as high as 30%, when refined compound libraries are created from an existing hit. "There's a lot more effort these days on designing libraries for a given chemotype or pharmacophore, which amounts to second-generation combichem: combinatorial methods coupled with computational chemistry."
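To put those hit rates in perspective, here is a minimal sketch of the expected number of hits from a library of fixed size. The 300-compound library size is a hypothetical chosen for illustration; only the 4% and 30% rates come from Early.

```python
LIBRARY_SIZE = 300  # hypothetical focused-library size, not a figure from Early

def expected_hits(library_size, hit_rate_percent):
    """Expected number of hits, using integer percent to avoid float rounding."""
    return library_size * hit_rate_percent // 100

baseline = expected_hits(LIBRARY_SIZE, 4)   # about 4% hit rate
refined = expected_hits(LIBRARY_SIZE, 30)   # up to 30% for a refined library

print(f"Baseline library: {baseline} expected hits")
print(f"Refined library:  {refined} expected hits")
```

For the same screening effort, the refined library would be expected to yield roughly seven to eight times as many hits.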

The reason, says Early, is that even when exquisite computational and rational design tools are coupled with modern chemical methods, and when the biology is perfectly understood, no chemist can be certain what structures will be active.

Although medicinal chemistry's new tools have not yet caused product pipelines to burst, there is an overall sense of higher quality in hits and leads compared to a decade ago. According to Early, combichem is doing a good job of helping to weed out toxicity much earlier in development. "That means compounds reaching the candidate stage will succeed more often than in the old days." Biological understanding is also helping to focus discovery activities, including synthesis and library development. "The black boxes surrounding disease states are getting smaller and smaller," says Early.

"In some respects, disappointment in larger libraries is unwarranted and premature. It took almost 20 years before computational chemistry began producing drugs. I expect the same will be true for combichem. However, it is possible to use both rational and combinatorial methods to create more targeted, smaller libraries."

Exploiting serendipity
Rational design arguments notwithstanding, Albany Molecular Research's success suggests that the era of large libraries is by no means over. Jim Connelly, PhD, who directs library production at the Aventis Combinatorial Technologies Center (ACTC), Tucson, Ariz., makes an eloquent argument in favor of large libraries and the potential synergy between combinatorial methods and rational design.

Small, targeted libraries make sense when biological data is plentiful, says Connelly. "In many situations, however, you don't have as much information about the target as you'd like, which makes rational design difficult. And even when targets fall within known classes, rational design doesn't always yield new chemical matter."

For example, quite a bit is known about kinase substrates, activity, and active sites but far less is understood about G-protein-coupled receptors (GPCRs). Connelly's group would therefore probably construct a larger library against a GPCR target than against a kinase.

Larger libraries, says Connelly, open up the doors of serendipity, a word that medicinal chemists often use pejoratively as a synonym for "luck" but which he views as a way to increase the opportunity for finding activity, even in side products and reaction junk.

According to Connelly, research directors are mistaken when they believe that combinatorial libraries are the source of chemical diversity. "That's probably not true," says Connelly, "because compounds within the library will share common scaffolds. What large libraries allow you to do is to create density within chemical space."


Fluorous Mixture Synthesis is the first technique that allows scientists to reap the benefits of solution-phase mixture synthesis and still maintain predictable isolation of individual high-purity products. (Source: Fluorous Technologies Inc.)

Bridging Solid and Solution Chemistry
According to Wei Zhang, who heads combinatorial chemistry at Fluorous Technologies, Pittsburgh, traditional solid-phase combinatorial methods have serious shortcomings. "You can make huge libraries using solid phase, but the technique sacrifices a good deal of the versatility, kinetics, and reactivity of homogeneous reactions." Many solution-phase reactions do not transfer to solid-phase, and the bottom line is that "the technique did not generate enough chemical diversity, or enough candidates."

Founded in 2000 by Professor Dennis Curran of the University of Pittsburgh, Fluorous has developed new library-worthy chemistry, based on fluorous tags, which combines the benefits of solution-phase reactions and solid-phase separations.

Fluorous tags are perfluoroalkyl residues attached to molecules at the beginning of a synthetic sequence and removed at the end. Like solid-phase resins, Fluorous "phase tags" allow easy identification and separation of products, but in solution rather than on a resin. Thus, the company claims that the tags offer the benefits of both solid-phase and solution-phase synthesis. Fluorous tags are chemically inert, used in stoichiometric quantities, work with both normal- and reverse-phase chromatography, and help speed reactions by conferring organic-phase solubility.

Fluorous residues may be used as tagging-protecting groups, as sub-structures of synthons, on reagents, or in solid-phase extraction. A Fluorous-protected substrate, for example, may be easily followed and purified during a long synthetic sequence. Synthons containing Fluorous tags label substrates with which they react and are easily removed after a particular step.

When used on reagents, the tags facilitate removal of excess or expended species. Although the list of Fluorous reagents is not huge, several important reactions are now Fluorous-enabled. Examples include Fluorous triphenylphosphine and diethyl azodicarboxylate. According to Fluorous Technologies, Fluorous formats are commercially available for organotin, coupling, oxidation, and reducing agents.


Organizations mentioned in this article:
Albany Molecular Research
Avalon Pharmaceuticals
Aventis Combinatorial Technologies Center
Chemical Diversity Labs Inc.
Inologic
Nanostream Inc.
Roche Palo Alto LLC
Scion Pharmaceuticals


© 2005 Reed Business Information a division of Reed Elsevier Inc. All rights reserved.