MongoDB 6.0 Brings Encrypted Queries, Time-Series Data Collection (thenewstack.io)

Posted by msmash on Tuesday June 07, 2022 @12:41PM from the moving-forward dept.

The developers behind the open source MongoDB, and its commercial service counterpart MongoDB Atlas, have been busy making the document database easier to use for developers. From a report: Available in preview, Queryable Encryption provides the ability to query encrypted data, with the entire query transaction encrypted -- an industry first, according to MongoDB. This feature will be of interest to organizations with a lot of sensitive data, such as banks, health care institutions and the government. This eliminates the need for developers to be experts in encryption, Davidson said. This end-to-end client-side encryption uses novel encrypted index data structures; the data being searched remains encrypted at all times on the database server, including in memory and in the CPU. The keys never leave the application, and the company maintains that neither query speed nor overall application performance is impacted by the new feature.

MongoDB is also now supporting time-series data, which is important for monitoring physical systems, quick-moving financial data, or other temporally oriented datasets. In MongoDB 6.0, time-series collections can have secondary indexes on measurements, and the database system has been optimized to sort time-based data more quickly. Although there are a number of databases geared specifically towards time-series data, such as InfluxDB, many organizations may not want to stand up an entire database system for this specific use, since a separate system costs more in terms of support and expertise, Davidson argued. Another feature is Cluster-to-Cluster Synchronization, which provides continuous data synchronization of MongoDB clusters across environments. It works with Atlas, in private cloud, on-premises, or on the edge. This sets the stage for using data in multiple places for testing, analytics, and backup.

This discussion has been archived. No new comments can be posted.

In any sane world... (Score:1, Insightful)

by Brain-Fu ( 1274756 ) writes:

...MongoDB would be regarded as humor, similar to the intended-as-humor programming languages like Ook! [esolangs.org] or whitespace [wikipedia.org].
Relational database design, with good indexing and transaction management, has proven to be the best way to build databases. It amazes me that serious software developers ever choose Mongo.
- Re: (Score:2)
  
  by jd ( 1658 ) writes:
  
  RDBMS are good where data can be compartmentalized. Blobs - there's a reason blobs aren't popular in such databases. Mongo is good for blobs that are indexed via relational databases. For caching things like credentials, I'd go for Memcached. No sense in using something heavy for something trivial.
  - Re: (Score:2)
    
    by K. S. Kyosuke ( 729550 ) writes:
    
    Relational databases literally *invented* blobs. Interbase had blobs in the mid-1980s or so.
  - Re: (Score:2)
    
    by account_deleted ( 4530225 ) writes:
    
    Comment removed based on user account deletion
    - Re: (Score:2)
      
      by jd ( 1658 ) writes:
      
      A star database uses blobs and is most definitely a real database, but in no sense of the word would it be classed as relational.
- Re: (Score:2)
  
  by kunwon1 ( 795332 ) writes:
  
  RDBMS do many things better, but not everything better. Mongo has its place - unfortunately it's often shoehorned into places it doesn't belong
- Re: (Score:2)
  
  by PeeAitchPee ( 712652 ) writes:
  
  Different tools for different jobs. For tried-and-true "standard" application dev where concepts like referential integrity matter quite a bit, then yeah, relational databases are the way to go. For storing and searching across billions of JSON-based records very quickly (not necessarily objects you made yourself per se, but for example, JSON output coming from various automated feeds), MongoDB is outstanding and scales extremely well.
  - Re: (Score:2)
    
    by Joey Vegetables ( 686525 ) writes:
    
    I'm sure that's true; however, if Freedom/Openness matter, there are other comparable tools that still are FOSS.
  - Re: (Score:2)
    
    by sfcat ( 872532 ) writes:
    
    For storing and searching across billions of JSON-based records very quickly (not necessarily objects you made yourself per se, but for example, JSON output coming from various automated feeds), MongoDB is outstanding and scales extremely well.
    Just because you think your 1000 node MongoDB cluster doing that query in 5 seconds is good doesn't mean it is. How many rows/core sec is your cluster doing? If it's less than 1,000,000/core sec, your system is slow and you should have used a relational DB. Wait, you said you were using MongoDB so I already know the answer to my question. Do you like wasting energy and your employer's money? You seem to brag about it. My Amazon stock thanks you for your donation.
  - Re: (Score:2)
    
    by cstacy ( 534252 ) writes:
    
    MongoDB is outstanding and scales extremely well.
    It is web scale [youtube.com]
- - Re: In any sane world... (Score:2)
    
    by ttfkam ( 37064 ) writes:
    
    You say that like relational and transactional (aka OLTP) are synonymous. They are not.
    Redshift: column-oriented analytic RELATIONAL database (OLAP)
    TimescaleDB: time series RELATIONAL database
    YugabyteDB: distributed multi-master with horizontal scaling
    And those are just some of the PostgreSQL-compatible offerings, hardly an exhaustive list of all SQL-compatible offerings out there from multiple vendors.
- Re: (Score:2)
  
  by im_thatoneguy ( 819432 ) writes:
  
  Trying to standardize all data in the universe into a specific set of N columns is the real insanity.
  Relational databases are awesome for perfectly defined data you need to store a crap ton of. But there's tons of data that doesn't neatly fit into a box.
  - Re: (Score:2)
    
    by sfcat ( 872532 ) writes:
    
    Sure it does. Every time someone claims it doesn't, I run a little utility I wrote to analyze the JSON and discover there are only 3 or 5 or some single-digit number of use cases in their data store. I have heard this story many times. It has never been true for somewhere I worked. It probably isn't for you either. But you think, "it doesn't matter". Really? Because you are giving up 90% of your performance to have a bunch of JSON instead of a relational data query. You seriously think that is a good t
- Re: (Score:2)
  
  by Joey Vegetables ( 686525 ) writes:
  
  I think I agree, but maybe for different reasons.
  I'm not necessarily against non-relational databases, although I've rarely encountered any situation where a relational database engine could not be made to work pretty easily and effectively even if the nature of the data to be stored can be modeled as key-value pairs and/or documents.
  However, for both philosophical reasons, and practical ones, I won't go non-FOSS if I don't have to. MongoDB's current license is neither Free nor Open Source, according to th
- Good for you? (Score:1)
  
  by Joviex ( 976416 ) writes:
  
  ...MongoDB would be regarded as humor....
  I guess you would attempt to build a house with a hammer and nails only as well.
  
  Only Morons blame their tools.
  It amazes me that serious software developers ever choose Mongo.
  It amazes me there are morons in my field, but hey, I guess it's the same everywhere.
  - Re: (Score:2)
    
    by Joey Vegetables ( 686525 ) writes:
    
    We don't always get to choose our tools. When I can, I go Free / Open source if it is possible. For both practical, and ethical, reasons.
- Re: (Score:2)
  
  by thegarbz ( 1787294 ) writes:
  
  It amazes me that serious software developers ever choose Mongo.
  It amazes me that someone with your relatively low UID still speaks in absolutes without ever considering a use case. I'd expect that from some 15 year old with a UID in the 8 million, but you should have grown beyond such stupid assertions by now.
  - Re: (Score:2)
    
    by Brain-Fu ( 1274756 ) writes:
    
    Well it certainly looks like I stepped on a lot of toes, given the angry mods and posts, including personal attacks on me like the one you made. Strong opinions are characteristic of slashdot though. It comes with the territory.
    Just look at what happens to any poor fool who voices support for systemd!
    Anyway, I am sure everyone's toes will heal up just fine.
    - Re: (Score:2)
      
      by thegarbz ( 1787294 ) writes:
      
      including personal attacks on me like the one you made.
      Oh I'm not attacking you, I'm just calling you out on what you said. You made an assumption and it came across looking incredibly stupid. You didn't step on toes, you just punked yourself with your high-and-mighty attitude.
      - Re: (Score:2)
        
        by Brain-Fu ( 1274756 ) writes:
        
        Sounds like several distinctions-without-a-difference to me.
    - Re: (Score:2)
      
      by account_deleted ( 4530225 ) writes:
      
      Comment removed based on user account deletion
      - Re: In any sane world... (Score:2)
        
        by ttfkam ( 37064 ) writes:
        
        There's definitely use cases for NoSQL. That doesn't necessarily mean that all NoSQL offerings are worthwhile. I for one like DynamoDB and Redis for certain specific use cases. MongoDB on the other hand can kick rocks. I'd rather use Postgres with indexed jsonb columns over MongoDB every day of the week and twice on Sundays.
        Because one day, you're gonna need something beyond JSON, and the MongoDB dev will not be happy on that day. They'll make excuses, wrap their data in makeshift JSON, tie their code in sp
        
        Re: (Score:2)
        
        by Joey Vegetables ( 686525 ) writes:
        
        That's my experience as well. There are hypothetical cases where a document store could offer advantages over an RDBMS, but in 30 years of doing data and app development, I've never encountered one.
        Also, in my experience, there are a LOT of devs who would rather write thousands of lines of buggy code to re-implement features already present in any RDBMS, than to spend the couple days it would take to learn SQL. Same with ORMs. There are valid use cases for these - though, again, I've yet to encounter one
- Career DB Expert Here...you're very incorrect (Score:4, Interesting)
  
  by Somervillain ( 4719341 ) writes: on Tuesday June 07, 2022 @03:39PM (#62601000)
  
  Relational database design, with good indexing and transaction management, has proven to be the best way to build databases. It amazes me that serious software developers ever choose Mongo.
  The dumbest mistake a professional could make would be using the wrong DB for their model. The relational model is great for what it's designed for: strongly typed and consistent data. If you're managing product inventory or financial transactions, it cannot be beat. It was very much optimized for those use cases.
  
  For my problem domain and nearly all I've worked on, the data model wasn't so rigid. I manage complex and dynamic data from customers. The structure will vary drastically based on what the customer paid for and how they choose to use our product. A document data model, like Mongo, is much better suited for our model. I've seen it done both ways. You can store it in a single Mongo collection (table) or 1000 relational tables in Oracle and get 100x the performance in Mongo for a fraction of the cost and about 1/100th of the total Java code needed to manage it.
  
  To correctly model all use cases, we need approx 1000 tables. In the end, those tables are just mashed together to form one big JSON document. About 25-50 REST endpoints manage the individual facets of the JSON document.
  
  Because someone was an idiot and said "Oracle is the best, we should always use Oracle" we maintain massive amounts of JPA data to massage this dynamic data into something regular and relational. When all you have is a hammer, everything looks like a nail. We have massive codebases that do nothing but take JSON, break it into JPA models that map to our tables, breaking them into 100s of pieces, then stitching them back together in the EXACT same format...never actually using those tiny pieces they broke them into. It's dysfunctional, expensive, our users suffer, our cloud computing bills are through the roof, our licensing costs are insane (Oracle is EXPENSIVE), but it definitely is properly relational. And, no, no amount of denormalization can fix it. The only way to fix it is to use a document DB or basically design our relational tables to make them a homemade document DB.
  
  Prior to my current job, I worked in healthcare software. It's a similar situation. Healthcare record databases are MASSIVE and dynamic. The record used by an oncology patient is much different than a dermatology patient. You can do things a few ways: either force a rigid, unnatural structure on the data coming in so it can fit your DB model, which is how shitty healthcare systems operate with all their weird arbitrary codes...or you can let each practice dictate the model, but then it becomes a massive, complex dataset that makes no sense and is nearly impossible to maintain...or you can use a document database, like Mongo, that allows more flexibility and lets the Oncology dept store data in their format and the radiology dept store theirs in another and the DB can largely be agnostic to it.
  
  Mongo is meant for dynamic structures. Relational DBs are meant for rigid structures. Cassandra is optimized for different problems than either.
  
  Database types are a lot like transportation types. I view Oracle as the semi-truck. It's great. It's powerful, but you'd never want to commute to work in a semi-truck. You'd never want to drive around the beach in a semi-truck. You wouldn't want to deliver packages or pizzas in a semi-truck. A semi-truck is great for its use case. A flatbed truck is better for others. A passenger car is better for hauling people and where I live, a bike is the best way to get around. It's stupid to make a semi-truck do the job of a Corolla or a bike and stupid to have a bike deliver freight.
  
  Pick the right tool for the job. Each major DB model has its strengths and weaknesses and is indispensable in some situation and horrible in others. For most jobs I've worked on, Mongo is much better than a relational DB, because I am usua
  
  - Re: (Score:2)
    
    by Pascoea ( 968200 ) writes:
    
    You'd never want to drive around the beach in a semi-truck.
    Speak for yourself. That sounds like it could be fun.
  - Re: Career DB Expert Here...you're very incorrect (Score:2)
    
    by ttfkam ( 37064 ) writes:
    
    Most major relational databases today have an indexable JSON data type. You don't need a thousand tables with hundreds of foreign keys.
    MongoDB is a one-hit wonder whose fifteen minutes were up quite a while ago. (I also find it difficult to forgive MongoDB for the numerous data integrity bugs they've had over the years.)
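The "indexable JSON in a relational database" point made in this thread can be sketched as follows. This is only an illustration under stated assumptions: it uses SQLite (via Python's standard `sqlite3` module, built with the JSON functions) as a stand-in for the databases the commenters name, and the table name, field names, and sample records are hypothetical. SQLite stores the document as text and indexes an expression over it, which is not the same thing as PostgreSQL's `jsonb` type.

```python
import json
import sqlite3


def build_demo_db():
    """Create an in-memory table of JSON documents with an index on one JSON field."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")
    # Expression index over a field inside the JSON document (hypothetical field 'dept').
    conn.execute(
        "CREATE INDEX idx_records_dept ON records (json_extract(body, '$.dept'))"
    )
    # Hypothetical records with differing shapes, one table, no per-shape schema.
    docs = [
        {"dept": "oncology", "stage": 2, "markers": ["HER2"]},
        {"dept": "dermatology", "lesion_mm": 4.5},
        {"dept": "oncology", "stage": 1},
    ]
    conn.executemany(
        "INSERT INTO records (body) VALUES (?)", [(json.dumps(d),) for d in docs]
    )
    return conn


def find_by_dept(conn, dept):
    """Return the decoded documents whose 'dept' field equals the given value."""
    rows = conn.execute(
        "SELECT body FROM records WHERE json_extract(body, '$.dept') = ? ORDER BY id",
        (dept,),
    )
    return [json.loads(body) for (body,) in rows]
```

Calling `find_by_dept(build_demo_db(), "oncology")` returns the two oncology documents in insertion order. Whether this approach performs acceptably at the scale argued about above is exactly what the thread disputes, and the sketch does not settle it.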
yes, sir, we have NoSQL (Score:2)

by Anonymouse Cowtard ( 6211666 ) writes:

So here's some bananas.
Big question (Score:2)

by jd ( 1658 ) writes:

Is the encryption order-preserving? (Reduces the need to decrypt the data to perform operations)
- - - - Re: (Score:2)
        
        by lsllll ( 830002 ) writes:
        
        The way data is exposed is almost ALWAYS via a compromised "client" machine talking to a database server (other than unencrypted database dumps I suppose). So, at the end of the day, it doesn't matter if the data is encrypted on the database. If the "client" machine (which would be something like the web server) has a key to decode the data, then the developer could still be on the hook for writing insecure code that allowed the hackers in in the first place.
        
        Re: (Score:1)
        
        by falzer ( 224563 ) writes:
        
        >The way data is exposed is almost ALWAYS via a compromised "client" machine talking to a database server
        Are there published stats on that or is that a hunch? I don't actually know.
        >(other than unencrypted database dumps I supposed).
        Yes, that's one of the reasons of having encrypted data at rest. Another is hardware theft, another is being able to store data with a host you don't completely trust.
        All the cloud data and backup providers aren't just going to stop encrypting data at rest just bec
        
        Re: (Score:2)
        
        by lsllll ( 830002 ) writes:
        
        I never said encrypting data wasn't a good thing. Every-single-computer-installation I do, including desktops, encrypts /home and the places where data exists. Heck! I even symbolic link the /root/.ssh directory to some place on an encrypted volume, in case the box is physically stolen. But what I said still stands. With the exception of theft, it doesn't matter if your data is encrypted or not. If the machine is operating, then it's not encrypted. The only exception to that is when the actual data i
  - Re:Big question (Score:4, Interesting)
    
    by gweihir ( 88907 ) writes: on Tuesday June 07, 2022 @02:11PM (#62600730)
    
    Pretty much. If we assume one of the usual encryption modes (CBC, OFB, CFB, GCM, etc.), they will need to be using a constant IV for "deterministic", otherwise they cannot do equality testing and searching without decryption on the server. Of course, the one thing an IV absolutely must fulfill for security to be good is that it must only be used once with a given encryption key. Otherwise you essentially fall back to ECB for the prefix of the two values that are the same (rounded down to full cipher block sizes). I.e. values that have the same prefix will have the same prefix in their encrypted forms. That is an exceptionally bad idea.
    Of course, they could be using a mode that is secure when used without an IV or with a constant IV, like used in disk encryption, e.g. EME mode or XTS mode. In that case only the full same values encrypt to the full same cipher text. This obviously cannot be avoided for the given application scenario.
    But here is what the documentation lists as the only allowed mode: AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic
    (https://www.mongodb.com/docs/v6.0/core/csfle/reference/encryption-schemas/)
    That would mean CBC mode with a constant IV and a massive security problem as a result.
    In essence, the user now _has_ to be an encryption expert to understand in which way exactly this approach is insecure and to assess the risks caused by that.
    No, no actual cryptography expert would be caught dead doing something this stupid and then implying it is secure.
    
    - Re: (Score:2)
      
      by WaffleMonster ( 969671 ) writes:
      
      Of course, the one thing an IV absolutely must fulfill for security to be good is that it must only be used once with a given encryption key. Otherwise you essentially fall back to ECB for the prefix of the two values that are the same (rounded down to full cipher block sizes). I.e. values that have the same prefix will have the same prefix in their encrypted forms. That is an exceptionally bad idea.
      The IV in this case happens to be derived from a truncated HMAC over the plaintext. There is no realistic risk of reuse separate from encryption of the same exact plaintext.
      Obviously not as good as a random IV for some it may be worthwhile tradeoff vs data tier having access to keys.
      - Re: (Score:2)
        
        by gweihir ( 88907 ) writes:
        
        Well, better than constant, but really hard to analyze what the risk is. Why have they not used something established and known to be secure like XTS mode? Home-cooked crypto fails in the most surprising ways.
  - Re: (Score:2)
    
    by holophrastic ( 221104 ) writes:
    
    ...that would be the "encryption" from world war 2. It worked great back then. It's not considered to be encryption anymore -- since common statistical characteristics of a given language are enough to decrypt so much of the data that it becomes completely meaningless for security.
- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  Does not need to be. All you need is comparison for equality. You can sort or hash the encrypted table elements by their encrypted values. You just need to encrypt all of them with the same key and the same IV (bad!) if your encryption scheme needs an IV.
  - Re: (Score:2)
    
    by jd ( 1658 ) writes:
    
    When it's a key/value database, equality will tell you when a key matches, yes. Assuming, as you say, you use the same key and IV.
    This article looks interesting - a pseudo-public-key OPE system: https://dl.acm.org/doi/10.1145... [acm.org]
    This helps when you have to use the same keys everywhere, as the encryption key can't be used to decrypt the data.
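The disagreement in this thread (a constant IV versus an IV derived from an HMAC over the plaintext) can be illustrated with a toy CBC chain. This is a sketch under loud assumptions: Python's standard library has no AES, so `_block_fn` below is a keyed hash truncated to one block, which is neither invertible nor a real cipher and only serves to show how CBC chaining propagates the IV. The keys and sample values are hypothetical, and the IV derivation follows the description given in the comments above rather than anything verified against MongoDB's implementation.

```python
import hashlib
import hmac

BLOCK = 16  # bytes per block, matching the AES block size

CONSTANT_IV = bytes(BLOCK)  # the "same IV every time" case criticized above


def _block_fn(key, block):
    # Toy stand-in for a block cipher: keyed PRF truncated to one block.
    # NOT invertible and NOT secure; it exists only to show the chaining.
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def _pad(data):
    # PKCS#7-style padding up to a whole number of blocks.
    n = BLOCK - len(data) % BLOCK
    return data + bytes([n]) * n


def cbc_encrypt(key, iv, plaintext):
    """Return the list of ciphertext blocks produced by CBC chaining."""
    prev, out = iv, []
    padded = _pad(plaintext)
    for i in range(0, len(padded), BLOCK):
        mixed = bytes(a ^ b for a, b in zip(padded[i:i + BLOCK], prev))
        prev = _block_fn(key, mixed)
        out.append(prev)
    return out


def derived_iv(mac_key, plaintext):
    """IV taken from a truncated HMAC over the whole plaintext."""
    return hmac.new(mac_key, plaintext, hashlib.sha512).digest()[:BLOCK]


def first_blocks_match(key, iv_a, a, iv_b, b):
    """True if the two ciphertexts share their first block."""
    return cbc_encrypt(key, iv_a, a)[0] == cbc_encrypt(key, iv_b, b)[0]
```

With two values that share a 16-byte prefix, such as `b"ACCOUNT-00000001alice"` and `b"ACCOUNT-00000001bob"`, the constant IV yields an identical first ciphertext block, which is the prefix leak described above. With `derived_iv`, the IVs differ whenever the plaintexts differ, so only fully identical values encrypt identically. That remaining equality leak is inherent to any scheme the server can compare without the key.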
Always encrypted? (Score:2)

by holophrastic ( 221104 ) writes:

I guess I'm a little confused. If I ask my DB for all of the contacts with a first name containing the letter "Q", I'm expecting it to only return that subset of my contacts. Obviously, it'll need to decrypt the contact names somewhere. So if the DB isn't doing it (which I've always thought was the efficient place to have it done, hence the indexing, and not transferring the many unmatched contacts), then which part is doing that hard work?
Something's weird here.
- Re: (Score:2)
  
  by lsllll ( 830002 ) writes:
  
  See my post below. I basically asked the same question. My thoughts are that the indexing would have to be done by the client (who has the data and the key) and passed on to the database. You'd have to know all the indexes you're going to have ahead of time so that they can be built and passed on to the database. If you ever need to build a new index, all the data would have to be returned to the client for it to build the index and return back to the database. And of course you'll run into the situati
  - Re: (Score:2)
    
    by holophrastic ( 221104 ) writes:
    
    ...and wouldn't that then mean that your indexes are effectively unencrypted? I mean, if my million contacts are indexed with enough information to be used, then there's enough data gleanable from it too.
    Having watched their lovely, and a little-too-fast animated diagram, it is indeed always encrypted at the db, with the decryption done elsewhere. So the security problem isn't at the db, it's at the elsewhere. Great.
Huh? (Score:2)

by lsllll ( 830002 ) writes:

This end-to-end client-side encryption uses novel encrypted index data structures; the data being searched remains encrypted at all times on the database server, including in memory and in the CPU. The keys never leave the application, and the company maintains that neither query speed nor overall application performance is impacted by the new feature.
The only way the database would have an encrypted index is if it had the data and the key (private or public) to encrypt it, which means it had access to the data at one point, or the onus of indexing and encrypting falls on the client. But for the client to do that, the client must have access to all the data. So is the whole database transmitted to the client in order for a new index to be built?
- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  Not needed. But they have a massive security problem. As far as I could find out, they are using CBC mode with a constant key and IV. That way they can compare for equality (and sort or hash) on the server without decryption. But of course the one thing you must never do with an IV is to re-use it unless the key has changed.
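The "not needed" answer above can be sketched with a generic equality-token index: the client computes a deterministic keyed token for each indexed value at write time, and the server indexes the tokens without ever holding the key. This is an assumption-laden illustration, not MongoDB's actual index structure, which the thread does not describe. The class and function names, the key, and the sample data are all hypothetical, and encrypting the payload itself is left out (an opaque placeholder stands in for the ciphertext).

```python
import hashlib
import hmac
from collections import defaultdict


def equality_token(index_key, field, value):
    """Client side: deterministic keyed token for one field value."""
    message = field.encode() + b"\x00" + value.encode()
    return hmac.new(index_key, message, hashlib.sha256).hexdigest()


class TokenIndexServer:
    """Server side: stores opaque payloads and tokens, never the key."""

    def __init__(self):
        self.payloads = {}
        self.index = defaultdict(set)

    def insert(self, doc_id, payload, tokens):
        self.payloads[doc_id] = payload
        for tok in tokens:
            self.index[tok].add(doc_id)

    def find(self, tok):
        # Equality lookup on the token; no decryption happens here.
        return sorted(self.index.get(tok, ()))


def demo():
    index_key = b"client-only-secret"  # hypothetical key; stays on the client
    server = TokenIndexServer()
    people = {1: "Quentin", 2: "Alice", 3: "Quentin"}
    for doc_id, name in people.items():
        payload = b"<ciphertext placeholder>"  # real payload encryption omitted
        server.insert(doc_id, payload, [equality_token(index_key, "first_name", name)])
    # The client tokenizes the search value and sends only the token.
    return server.find(equality_token(index_key, "first_name", "Quentin"))
```

Here `demo()` finds documents 1 and 3 without the server learning the name. The trade-offs raised in the discussion still apply to a sketch like this: the server can see which documents share a value, adding an index on an existing field means the client has to re-read and tokenize the old data, and substring searches are not supported by equality tokens at all.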
Probably has massive security problems (Score:2)

by gweihir ( 88907 ) writes:

Going to reference my own explanation elsewhere in this comment section: https://slashdot.org/comments.... [slashdot.org]
Looks like no actual cryptography expert was involved in the design of this mechanism. As a result, the users now have to be cryptography experts to understand the implications.
- Re: (Score:2)
  
  by 93 Escort Wagon ( 326346 ) writes:
  
  Looks like no actual cryptography expert was involved in the design of this mechanism. As a result, the users now have to be cryptography experts to understand the implications.
  Given all the problems we've seen with MongoDB default installs over the past several years, this unfortunately doesn't surprise me in the least. As a matter of fact, it's basically what I expected from a new MongoDB "feature".
  On a side note: it's a good excuse to re-link to this [youtu.be]
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    So you see a longer-term trend here? Does not surprise me.
- Re: (Score:3)
  
  by account_deleted ( 4530225 ) writes:
  
  Comment removed based on user account deletion
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    Well, there _is_ a striking similarity in the two aspects...
  - Re: (Score:2)
    
    by sfcat ( 872532 ) writes:
    
    Looks like no actual cryptography expert was involved in the design of this mechanism. As a result, the users now have to be cryptography experts to understand the implications.
    Kinda makes sense given most NoSQL database management systems were built by teams with no actual database experts...
    (OK, OK, I'm going to hell...)
    Posting factual information is a reason to go to hell? That explains social media...
