Continuing the NoSQL journey with MongoDB, I would like to touch on one specific use case that comes up very often: storing hierarchical document relations. MongoDB is an awesome document data store, but what if documents have parent-child relationships? Can we effectively store and query such document hierarchies? The answer, for sure, is yes, we can. The MongoDB documentation has several recommendations on how to store trees. One of the solutions described there, and quite widely used, is the materialized path.
Let me explain how it works with a few very simple examples. As in previous posts, we will build a Spring application using the recently released version 1.0 of the Spring Data MongoDB project. Our POM file contains only the basic dependencies, nothing more.
Project: mongodb, com.example.spring, version 0.0.1-SNAPSHOT, packaging jar (model version 4.0.0)
Properties: encoding UTF-8, spring.version = 3.0.7.RELEASE
Dependencies:
- org.springframework.data:spring-data-mongodb:1.0.0.RELEASE (with org.springframework:spring-beans and org.springframework:spring-expression listed under it without versions, apparently as exclusions)
- cglib:cglib-nodep:2.2
- log4j:log4j:1.2.16
- org.mongodb:mongo-java-driver:2.7.2
- org.springframework:spring-core:${spring.version}
- org.springframework:spring-context:${spring.version}
- org.springframework:spring-context-support:${spring.version}
Build plugin: org.apache.maven.plugins:maven-compiler-plugin:2.3.2, Java 1.6
To configure the Spring context, I will use the Java-class-based configuration approach. I advocate this style more and more because it provides strongly typed configuration and most mistakes can be caught at compile time, with no need to inspect your XML files anymore. Here is how it looks:
package com.example.mongodb.hierarchical;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.mongodb.core.MongoFactoryBean;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.SimpleMongoDbFactory;

@Configuration
public class AppConfig {
    @Bean
    public MongoFactoryBean mongo() {
        final MongoFactoryBean factory = new MongoFactoryBean();
        factory.setHost( "localhost" );
        return factory;
    }

    @Bean
    public SimpleMongoDbFactory mongoDbFactory() throws Exception {
        return new SimpleMongoDbFactory( mongo().getObject(), "hierarchical" );
    }

    @Bean
    public MongoTemplate mongoTemplate() throws Exception {
        return new MongoTemplate( mongoDbFactory() );
    }

    @Bean
    public IDocumentHierarchyService documentHierarchyService() throws Exception {
        return new DocumentHierarchyService( mongoTemplate() );
    }
}
That's pretty nice and clear. Thanks, Spring guys! Now all the boilerplate is ready. Let's move on to the interesting part: documents. Our database will contain a 'documents' collection which stores documents of type SimpleDocument. We describe this using Spring Data MongoDB annotations on the SimpleDocument POJO.
package com.example.mongodb.hierarchical;

import java.util.Collection;
import java.util.HashSet;

import org.springframework.data.annotation.Id;
import org.springframework.data.annotation.Transient;
import org.springframework.data.mongodb.core.mapping.Document;
import org.springframework.data.mongodb.core.mapping.Field;

@Document( collection = "documents" )
public class SimpleDocument {
    public static final String PATH_SEPARATOR = ".";

    @Id private String id;
    @Field private String name;
    @Field private String path;

    // We won't store this collection as part of document but will build it on demand
    @Transient private Collection< SimpleDocument > documents = new HashSet< SimpleDocument >();

    public SimpleDocument() {
    }

    public SimpleDocument( final String id, final String name ) {
        this.id = id;
        this.name = name;
        this.path = id;
    }

    public SimpleDocument( final String id, final String name, final SimpleDocument parent ) {
        this( id, name );
        this.path = parent.getPath() + PATH_SEPARATOR + id;
    }

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getPath() {
        return path;
    }

    public void setPath(String path) {
        this.path = path;
    }

    public Collection< SimpleDocument > getDocuments() {
        return documents;
    }
}
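The path assembled in the constructors above is just a dot-separated string, so its behaviour can be explored without MongoDB at all. The following is a minimal sketch in plain Java; the MaterializedPaths class and its method names are hypothetical helpers introduced only for illustration and are not part of the project.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the original project: plain-Java operations
// on a materialized path such as "1.2.4" (ancestor ids first, own id last).
final class MaterializedPaths {
    static final String SEPARATOR = ".";

    private MaterializedPaths() {
    }

    // Path of a child: the parent's path plus the child's own id.
    static String childPath(String parentPath, String id) {
        return parentPath + SEPARATOR + id;
    }

    // Path of the direct parent, or null for a root document.
    static String parentPath(String path) {
        int index = path.lastIndexOf(SEPARATOR);
        return index < 0 ? null : path.substring(0, index);
    }

    // Ids from the root down to the document itself.
    static List<String> ids(String path) {
        return Arrays.asList(path.split("[.]"));
    }

    // True when 'path' lies strictly below 'ancestorPath' in the hierarchy.
    static boolean isDescendant(String path, String ancestorPath) {
        return path.startsWith(ancestorPath + SEPARATOR);
    }

    public static void main(String[] args) {
        String path = childPath(childPath("1", "2"), "4");
        System.out.println(path);                       // 1.2.4
        System.out.println(parentPath(path));           // 1.2
        System.out.println(ids(path));                  // [1, 2, 4]
        System.out.println(isDescendant(path, "1"));    // true
        System.out.println(isDescendant("12.3", "1"));  // false
    }
}
```

Note that the descendant check appends the separator before comparing, so a document under root 12 is not mistaken for a descendant of root 1.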
Let me explain a few things about SimpleDocument. First, the magic property path: this is the key to constructing and querying our hierarchy. The path contains the identifiers of all the document's ancestors followed by the document's own identifier, divided by a separator, in our case just . (dot). For example, the document with id 4 whose parent is document 2 under root document 1 has the path 1.2.4. Storing hierarchical relationships this way allows us to quickly build, search and navigate the hierarchy. Second, notice the transient documents collection: this non-persistent collection is not stored in MongoDB but is built on demand by the service and contains all child documents (which, in turn, contain their own children). Let's see it in action by looking at the find method implementation:
package com.example.mongodb.hierarchical;

import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

import org.springframework.data.mongodb.core.MongoOperations;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;

// Implements the interface that AppConfig exposes as a bean
public class DocumentHierarchyService implements IDocumentHierarchyService {
    private MongoOperations template;

    public DocumentHierarchyService( final MongoOperations template ) {
        this.template = template;
    }

    @Override
    public SimpleDocument find( final String id ) {
        // First query: the requested document itself
        final SimpleDocument document = template.findOne(
            Query.query( new Criteria( "id" ).is( id ) ),
            SimpleDocument.class
        );

        if( document == null ) {
            return document;
        }

        // Second query: all descendants, i.e. every path that starts with this
        // document's own path followed by the separator. Using the full path
        // (quoted, so its dots are literal) also works for non-root documents.
        return build(
            document,
            template.find(
                Query.query( new Criteria( "path" ).regex( "^" + Pattern.quote( document.getPath() ) + "[.]" ) ),
                SimpleDocument.class
            )
        );
    }

    private SimpleDocument build( final SimpleDocument root, final Collection< SimpleDocument > documents ) {
        // Index all descendants by their path
        final Map< String, SimpleDocument > map = new HashMap< String, SimpleDocument >();
        for( final SimpleDocument document: documents ) {
            map.put( document.getPath(), document );
        }

        // Attach every descendant to its direct parent
        for( final SimpleDocument document: documents ) {
            final String path = document
                .getPath()
                .substring( 0, document.getPath().lastIndexOf( SimpleDocument.PATH_SEPARATOR ) );

            if( path.equals( root.getPath() ) ) {
                root.getDocuments().add( document );
            } else {
                final SimpleDocument parent = map.get( path );
                if( parent != null ) {
                    parent.getDocuments().add( document );
                }
            }
        }

        return root;
    }
}
As you can see, to get a single document with its whole hierarchy we need to run just two queries (a more optimal algorithm could reduce this to a single query). Here is a sample hierarchy and the result of reading the root document from MongoDB:
template.dropCollection( SimpleDocument.class );

final SimpleDocument parent = new SimpleDocument( "1", "Parent 1" );
final SimpleDocument child1 = new SimpleDocument( "2", "Child 1.1", parent );
final SimpleDocument child11 = new SimpleDocument( "3", "Child 1.1.1", child1 );
final SimpleDocument child12 = new SimpleDocument( "4", "Child 1.1.2", child1 );
final SimpleDocument child121 = new SimpleDocument( "5", "Child 1.1.2.1", child12 );
final SimpleDocument child13 = new SimpleDocument( "6", "Child 1.1.3", child1 );
final SimpleDocument child2 = new SimpleDocument( "7", "Child 1.2", parent );

template.insertAll( Arrays.asList( parent, child1, child11, child12, child121, child13, child2 ) );

...

final ApplicationContext context = new AnnotationConfigApplicationContext( AppConfig.class );
final IDocumentHierarchyService service = context.getBean( IDocumentHierarchyService.class );

final SimpleDocument document = service.find( "1" );
// Printing the document shows the following hierarchy:
//
// Parent 1
//   |-- Child 1.1
//     |-- Child 1.1.1
//     |-- Child 1.1.3
//     |-- Child 1.1.2
//       |-- Child 1.1.2.1
//   |-- Child 1.2
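The printing code itself is not shown in the post. A recursive printer along the following lines would produce a similar listing. This is only a sketch: Node is a hypothetical stand-in for SimpleDocument so that the example runs without Spring or MongoDB, and the exact indentation format is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for SimpleDocument: a name plus nested children.
final class Node {
    final String name;
    final List<Node> children = new ArrayList<Node>();

    Node(String name) {
        this.name = name;
    }

    // Adds a child and returns this node so calls can be chained.
    Node add(Node child) {
        children.add(child);
        return this;
    }
}

// Hypothetical printer, not part of the original project.
final class TreePrinter {
    private TreePrinter() {
    }

    // Renders the node and all of its descendants, one per line.
    static String render(Node root) {
        StringBuilder out = new StringBuilder();
        render(root, 0, out);
        return out.toString();
    }

    // Depth-first walk: two spaces of indentation per level, then the name.
    private static void render(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        if (depth > 0) {
            out.append("|-- ");
        }
        out.append(node.name).append('\n');
        for (Node child : node.children) {
            render(child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        Node child12 = new Node("Child 1.1.2").add(new Node("Child 1.1.2.1"));
        Node child1 = new Node("Child 1.1")
            .add(new Node("Child 1.1.1"))
            .add(child12)
            .add(new Node("Child 1.1.3"));
        Node parent = new Node("Parent 1").add(child1).add(new Node("Child 1.2"));
        System.out.print(render(parent));
    }
}
```

The sketch keeps children in a list, so they print in insertion order. SimpleDocument stores them in a HashSet, which gives no ordering guarantee and is why siblings in the listing above do not appear in creation order.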
That's it. A simple and powerful concept. Sure, adding an index on the path property will speed up the query significantly. There are plenty of possible improvements and optimizations, but the basic idea should be clear now.
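One such optimization is reading a document and all of its descendants with a single query. Because a document's id is the last segment of its own path and an inner segment of every descendant's path, one regular expression that looks for the id as a whole segment selects the entire subtree. The sketch below checks that idea locally with java.util.regex; the SubtreePattern class is a hypothetical helper, and passing the same expression to the regex criteria on path is an untested assumption about how it would be used against MongoDB.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch, not part of the original project: one pattern that
// selects a document together with all of its descendants by looking for the
// id as a whole segment of the dot-separated path.
final class SubtreePattern {
    private SubtreePattern() {
    }

    // Matches paths in which 'id' appears as a complete segment.
    static Pattern forId(String id) {
        return Pattern.compile("(^|[.])" + Pattern.quote(id) + "([.]|$)");
    }

    // Emulates the query locally: keeps only the paths the pattern finds.
    static List<String> select(String id, List<String> paths) {
        Pattern pattern = forId(id);
        List<String> selected = new ArrayList<String>();
        for (String path : paths) {
            if (pattern.matcher(path).find()) {
                selected.add(path);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList(
            "1", "1.2", "1.2.3", "1.2.4", "1.2.4.5", "1.2.6", "1.7", "12", "12.20");
        System.out.println(select("2", paths)); // [1.2, 1.2.3, 1.2.4, 1.2.4.5, 1.2.6]
    }
}
```

The trade-off is that this expression is not anchored to the start of the path, so it probably cannot take advantage of an index on path the way the prefix expression in the find method can. The root of the result is the document whose path ends with the requested id.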
10 comments:
Thank you,
Very nice example!!!!
A lot of sharing, thanks ! I'd be pleased to have a sample of saving java.util.Map using spring-data-mongo.
Hey Boris,
Thank you for the comment. With respect to saving java.util.Map using spring-data-mongo, it's quite straightforward as every instance of the map is being saved as inner document, every key becomes field name associated with respective value (I did it many times). Please let me know if you have more specific use case so I would be able to guide you through. Thank you!
Hi Andriy,
I have a similar use case where I need to implement a file system structure using MongoDB.
I set the parent file id on the file entity. That was good enough to perform all kind of operations, until I got a requirement to perform a search and present the full path to the user.
Your proposed solution is missing (maybe I missed it) the update of the path.
Can you please suggest me a solution for such a use case?
Hey zzzz,
That's a very good question to ask. The simplest solution could be illustrated on following example (based on your use case): user moves file(s) from one folder (parent) to another. In this case you have to replace old parent with new one with two steps:
- build a new 'path' for each document (file) affected
- update all documents (files) replacing 'path' property which has old parent with new one
Something like that:
template.updateMulti( Query.query( new Criteria( "path" ).is( oldPath ) ), Update.update( "path", newPath ), SimpleDocument.class );
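The first step, building the new 'path' for each affected document, can be sketched in plain Java. When a folder moves, its own path changes and every descendant's path needs the same prefix replaced. PathMover is a hypothetical helper introduced only for illustration; writing the resulting values back to MongoDB (for example with one update per changed document) is left out.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the original project: computes the new
// 'path' values when a subtree is moved under a different parent.
final class PathMover {
    private PathMover() {
    }

    // New path for one document, or the same path if it is outside the moved subtree.
    static String rewrite(String path, String oldPrefix, String newPrefix) {
        if (path.equals(oldPrefix)) {
            return newPrefix;
        }
        if (path.startsWith(oldPrefix + ".")) {
            return newPrefix + path.substring(oldPrefix.length());
        }
        return path;
    }

    // Maps every affected old path to its replacement, in iteration order.
    static Map<String, String> plan(Iterable<String> paths, String oldPrefix, String newPrefix) {
        Map<String, String> changes = new LinkedHashMap<String, String>();
        for (String path : paths) {
            String rewritten = rewrite(path, oldPrefix, newPrefix);
            if (!rewritten.equals(path)) {
                changes.put(path, rewritten);
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        // Move document 4 (path 1.2.4) from parent 2 to parent 7 (path 1.7).
        System.out.println(plan(
            Arrays.asList("1", "1.2", "1.2.4", "1.2.4.5", "1.2.6", "1.7"),
            "1.2.4", "1.7.4")); // {1.2.4=1.7.4, 1.2.4.5=1.7.4.5}
    }
}
```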
Please let me know if that's what you are looking for.
Thank you.
Best Regards,
Andriy Redko
i need full code
Hi!
Please find the complete project on Github: https://github.com/reta/mongodb-hierarchical-data
Thank you.
Best Regards,
Andriy Redko
thanks!.... can you help how to do this
http://www.codeproject.com/Questions/809685/How-Do-I-Add-Comments-To-Mongo-Db-Using-Java-And-H?arn=28
Hi,
You mostly have all your implementation done. SimpleDocument becomes a Comment and you just need to introduce the notion of the Post with collection of Comment objects (which are hierarchical already).
Thank you.
Best Regards,
Andriy Redko
Recursive function - to get Children(might be parents too) of parents