Maven Indexer¶
What Is The Maven Indexer?¶
The Maven Indexer is a library created by Sonatype Inc. and contributed to the Apache Maven team. It can produce a Lucene Index and carry out queries against it. This packed index can be downloaded from the repository and be used by:
- Artifact repository managers
- Check if a remote host contains an artifact, class, etc
- Check what versions of an artifact exist
- IDE-s
- Resolve dependencies
- To find out which packages contain the class you would like to import
- Custom tools
How Does The Maven Indexer Work?¶
The Maven Indexer uses:
- The artifact's
.pom
file - The artifact file itself
- The artifact's file system path
to figure out information about the artifacts which it adds to a Lucene index. This index can then be packed and downloaded by consumers.
What Kind Of Information Does The Maven Indexer Keep?¶
The Maven Indexer keeps a record of the following information:
- Basic artifact file attributes: last modified, file extension, file size
- Source and/or javadoc existence flags
- GAV coordinates (
groupId
,artifactId
,version
,packaging
,classifier
) - SHA-1 checksums (+ whether the related signature file exists)
What Artifacts Does The Maven Indexer Index?¶
The indexer will add a record to the Lucene index for every Maven artifact that it recognizes (the file name and
artifact path have to match the Maven artifact repository storage rules), regardless of whether or not
it has a .pom
file. Artifacts that have several classifiers will have all their sub-artifact files added to the index,
except a record for the POM.
Maven indexer also may or may not store a .pom
file as an artifact. However, firstly it tries to find matching real
artifact file in the file system, switching over to indexing that, instead of the .pom
file.
What's not indexable ?¶
The following file types are not indexable:
maven-metadata.xml
files.properties
files- checksum and signature files
.asc
,.md5
,.sha1
What Are Packed Indexes?¶
Packed indexes are either a complete compressed index, or a compressed subset of data which can be applied to an existing index incrementally.
What's the goal of packed indexes ?¶
Packed indexes are used for transferring indexes from the remote to the proxy/tool.
What Is The Maven Indexer Used For In The Strongbox Project?¶
The Maven Indexer is used for integration with IDE-s.
How Does The Maven Indexer Work In Strongbox ?¶
Strongbox allows you to download packed repository Maven Index. Every maven repository with indexing enabled serves the packed Maven Index. Based on the repository type, the index is prepared as follows:
- Hosted repositories:
Strongbox stores the information of uploaded artifacts (in hosted repositories) in the OrientDB database. This Information is used to create the hosted repository Maven Index. Strongbox serves following Maven Indexer Fields in indexer:
- artifactId;
- version;
- classifier;
- packaging/extension
- classnames
- lastModified
- size
- signatureExists
- sha1
- sourcesExists
- javadocExists
For each hosted maven repository defined in strongbox there should be a scheduled task configured to rebuild the index (unless you don't
want to serve Maven Index for some repository). Rebuild Maven Indexes Cron Job
is scheduled at strongbox startup on time specified by the cronExpression
value within repositoryConfiguration
in strongbox.yaml
configuration file for repository
with type
equal to hosted
.
The process of rebuilding the hosted repository Maven Index purges previous index and recreates it from scratch using OrientDB to keep more accurate information. Thanks to this the index is up-to-date, as the real server's contents.
- Proxy repositories:
Strongbox fetches the proxy repository Maven Index from remote host, stores it locally and serves it. For each proxy maven repository
defined in strongbox there should be a scheduled task configured to re-fetch the index (unless you don't want to serve Maven Index
for some repository). Download Remote Maven Index Cron Job
is scheduled at strongbox startup on time specified by the cronExpression
value within repositoryConfiguration
in strongbox.yaml
configuration file for repository
with type
equal to proxy
.
Strongbox supports incremental proxy repository Maven Index. It means that it will update the index by downloading only the missing Maven Index parts that were not downloaded before. Thanks to this feature, strongbox saves the bandwidth costs. Once the soft parts are downloaded, they are merged with the locally existing part and finally packed.
The Maven indexes produced by most public repository managers (such as Maven Central), are usually rebuilt once a week.
- Group repositories:
Strongbox creates the group repository Maven Index by merging their underlying repositories Maven Indexes. This process is recursive meaning that
root group repository will contain in the Maven Index all the information stored in every inner and outer vertex repository Maven Index.
Merge Maven Group Repository Index Cron Job
is scheduled at strongbox startup on time specified by the cronExpression
value within repositoryConfiguration
in strongbox.yaml
configuration file for repository
with type
equal to group
.
The process of rebuilding the group repository Maven Index purges previous index and recreates it from scratch to keep more accurate information.
Where Are The Maven Indexes Located in Strongbox ?¶
There are two types of Maven Indexer indexes:
- Local
- For hosted repositories, this contains the artifacts that have been deployed to this repository.
- For group repositories, this contains the merged index from the underlying repositories.
- Remote
- This is downloaded from the remote repository and contains a complete index of what is available on the remote.
Every repository (with enabled indexing) has an index under the strongbox-vault/storages/${storageId}/${repositoryId}/.index
directory
where the index is located.
- Hosted repositories: have:
- Local:
strongbox-vault/storages/${storageId}/${repositoryId}/local/.index
- Local:
- Proxy repositories have:
- Remote:
strongbox-vault/storages/${storageId}/${repositoryId}/remote/.index
- Remote:
- Group repositories have:
- Local:
strongbox-vault/storages/${storageId}/${repositoryId}/local/.index
- Local:
How to force to rebuild the repository index ?¶
Use REST API endpoint:
POST
/api/maven/index/{storageId}/{repositoryId}
- see also MavenIndexController
How to download the packed repository index ?¶
Use REST API endpoint:
GET
/storages/{storageId}/{repositoryId}/.index/nexus-maven-repository-index.gz
- see also MavenArtifactController
Information For Developers¶
The code for the Maven indexing is located under the strongbox-storage-maven-layout-provider module.
See Also¶
- Maven Indexer: Github
- Maven Indexer: About
- Maven Indexer: Fields
- Maven Indexer: Core (Notes)
- Maven Indexer: Examples
- Maven Indexer: Incremental Downloading
- Maven Indexing: Indexing (Part I)
- Maven Indexing: Searching (Part II)
- Maven Indexing: Packing (Part III)
- Maven Metadata
- Stackoverflow: [maven-indexer]