Databases IT Technology

MongoDB 6.0 Brings Encrypted Queries, Time-Series Data Collection (thenewstack.io) 53

The developers behind the open source MongoDB, and its commercial service counterpart MongoDB Atlas, have been busy making the document database easier to use for developers. From a report: Available in preview, Queryable Encryption provides the ability to query encrypted data, with the entire query transaction encrypted -- an industry first, according to MongoDB. This feature will be of interest to organizations with a lot of sensitive data, such as banks, health care institutions and the government. This eliminates the need for developers to be experts in encryption, Davidson said. This end-to-end client-side encryption uses novel encrypted index data structures; the data being searched remains encrypted at all times on the database server, including in memory and in the CPU. The keys never leave the application, and the company maintains that neither query speed nor overall application performance is impacted by the new feature.

MongoDB is also now supporting time series data, which are important for monitoring physical systems, quick-moving financial data, or other temporally-oriented datasets. In MongoDB 6.0, time-series collections can have secondary indexes on measurements, and the database system has been optimized to sort time-based data more quickly. Although there are a number of databases geared specifically towards time-series data, such as InfluxDB, many organizations may not want to stand up an entire database system for this specific use, since a separate system costs more in terms of support and expertise, Davidson argued. Another feature is Cluster-to-Cluster Synchronization, which provides continuous data synchronization of MongoDB clusters across environments. It works with Atlas, in private cloud, on-premises, or on the edge. This sets the stage for using data in multiple places for testing, analytics, and backup.
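To make the time-series feature concrete, here is a minimal sketch of the two database command documents involved: a `create` command with a `timeseries` option and a `createIndexes` command on a measurement field. It only builds the documents as plain Python dicts and sends nothing to a server. The collection name `weather`, the field names, and both helper functions are hypothetical and not taken from the report.

```python
# Sketch only: builds command documents as plain dicts; nothing is sent to a server.
# The collection name "weather" and all field names are hypothetical.


def timeseries_create_command(collection, time_field, meta_field=None, granularity=None):
    """Return a `create` command document for a time-series collection."""
    options = {"timeField": time_field}
    if meta_field is not None:
        options["metaField"] = meta_field
    if granularity is not None:
        # Expected values are "seconds", "minutes" or "hours".
        options["granularity"] = granularity
    return {"create": collection, "timeseries": options}


def measurement_index_command(collection, field, ascending=True):
    """Return a `createIndexes` command document for one measurement field."""
    direction = 1 if ascending else -1
    return {
        "createIndexes": collection,
        "indexes": [{"key": {field: direction}, "name": f"{field}_{direction}"}],
    }


create_cmd = timeseries_create_command("weather", "timestamp", "sensor", "minutes")
index_cmd = measurement_index_command("weather", "temperature")
```

With a driver, documents like these would typically be passed to a generic "run command" call, or expressed through the driver's own collection and index helpers.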

  • ...MongoDB would be regarded as humor, similar to the intended-as-humor programming languages like Ook! [esolangs.org] or whitespace [wikipedia.org].

    Relational database design, with good indexing and transaction management, has proven to be the best way to build databases. It amazes me that serious software developers ever choose Mongo.

    • by jd ( 1658 )

      RDBMS are good where data can be compartmentalized. Blobs - there's a reason blobs aren't popular in such databases. Mongo is good for blobs that are indexed via relational databases. For caching things like credentials, I'd go for Memcached. No sense in using something heavy for something trivial.

    • by kunwon1 ( 795332 )
      RDBMS do many things better, but not everything better. Mongo has its place - unfortunately it's often shoehorned into places it doesn't belong.
    • Different tools for different jobs. For tried-and-true "standard" application dev where concepts like referential integrity matter quite a bit, then yeah, relational databases are the way to go. For storing and searching across billions of JSON-based records very quickly (not necessarily objects you made yourself per se, but for example, JSON output coming from various automated feeds), MongoDB is outstanding and scales extremely well.
      • I'm sure that's true; however, if Freedom/Openness matter, there are other comparable tools that still are FOSS.
      • by sfcat ( 872532 )

        For storing and searching across billions of JSON-based records very quickly (not necessarily objects you made yourself per se, but for example, JSON output coming from various automated feeds), MongoDB is outstanding and scales extremely well.

        Just because you think your 1000 node MongoDB cluster doing that query in 5 seconds is good doesn't mean it is. How many rows/core sec is your cluster doing? If it's less than 1,000,000/core sec, your system is slow and you should have used a relational DB. Wait, you said you were using MongoDB so I already know the answer to my question. Do you like wasting energy and your employer's money? You seem to brag about it. My Amazon stock thanks you for your donation.

      • by cstacy ( 534252 )

        MongoDB is outstanding and scales extremely well.

        It is web scale [youtube.com]

    • Trying to standardize all data in the universe into a specific set of N columns is the real insanity.

      Relational databases are awesome for perfectly defined data you need to store a crap ton of. But there's tons of data that doesn't neatly fit into a box.

      • by sfcat ( 872532 )
        Sure it does. Every time someone claims it doesn't, I run a little utility I wrote to analyze the JSON and discover there are only 3 or 5 or some single digit number of use cases in their data store. I have heard this story many times. It has never been true for somewhere I worked. It probably isn't for you either. But you think, "it doesn't matter". Really? Because you are giving up 90% of your performance to have a bunch of JSON instead of a relational data query. You seriously think that is a good t
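The utility mentioned above is not shown in the thread, so the following is only a hypothetical sketch of how such an analysis might work: reduce every JSON document to its "shape" (key names and value types, no data) and count how many distinct shapes occur. The function names and the sample documents are made up for illustration.

```python
import json
from collections import Counter


def shape(value):
    """Reduce a JSON value to its structure: key names and value types, no data."""
    if isinstance(value, dict):
        return tuple(sorted((key, shape(child)) for key, child in value.items()))
    if isinstance(value, list):
        # A list is described by the set of shapes of its elements.
        return ("list", tuple(sorted({shape(item) for item in value}, key=repr)))
    return type(value).__name__


def count_shapes(json_lines):
    """Count how many documents share each distinct shape."""
    return Counter(shape(json.loads(line)) for line in json_lines)


# Hypothetical sample data: four documents, two distinct shapes.
sample = [
    '{"id": 1, "name": "a"}',
    '{"name": "b", "id": 2}',
    '{"id": 3, "tags": ["x", "y"]}',
    '{"id": 4, "tags": ["z"]}',
]
shapes = count_shapes(sample)
print(len(shapes), "distinct shapes in", sum(shapes.values()), "documents")
```

A small number of distinct shapes would support the claim that the data could be mapped onto a handful of tables; a large, growing number would support the opposite view.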
    • I think I agree, but maybe for different reasons.

      I'm not necessarily against non-relational databases, although I've rarely encountered any situation where a relational database engine could not be made to work pretty easily and effectively even if the nature of the data to be stored can be modeled as key-value pairs and/or documents.

      However, for both philosophical reasons, and practical ones, I won't go non-FOSS if I don't have to. MongoDB's current license is neither Free nor Open Source, according to th

    • ...MongoDB would be regarded as humor....

      I guess you would attempt to build a house with a hammer and nails only as well.

      Only Morons blame their tools.

      It amazes me that serious software developers ever choose Mongo.

      It amazes me there are morons in my field, but hey, I guess it's the same everywhere.

      • We don't always get to choose our tools. When I can, I go Free / Open source if it is possible. For both practical, and ethical, reasons.
    • It amazes me that serious software developers ever choose Mongo.

      It amazes me that someone with your relatively low UID still speaks in absolutes without ever considering a use case. I'd expect that from some 15 year old with a UID in the 8 million range, but you should have grown beyond such stupid assertions by now.

      • Well it certainly looks like I stepped on a lot of toes, given the angry mods and posts, including personal attacks on me like the one you made. Strong opinions are characteristic of slashdot though. It comes with the territory.

        Just look at what happens to any poor fool who voices support for systemd!

        Anyway, I am sure everyone's toes will heal up just fine.

        • including personal attacks on me like the one you made.

          Oh I'm not attacking you, I'm just calling you out on what you said. You made an assumption and it came across looking incredibly stupid. You didn't step on toes, you just punked yourself with your high-and-mighty attitude.

        • Comment removed based on user account deletion
          • There are definitely use cases for NoSQL. That doesn't necessarily mean that all NoSQL offerings are worthwhile. I for one like DynamoDB and Redis for certain specific use cases. MongoDB on the other hand can kick rocks. I'd rather use Postgres with indexed jsonb columns over MongoDB every day of the week and twice on Sundays.

            Because one day, you're gonna need something beyond JSON, and the MongoDB dev will not be happy on that day. They'll make excuses, wrap their data in makeshift JSON, tie their code in sp

            • That's my experience as well. There are hypothetical cases where a document store could offer advantages over an RDBMS, but in 30 years of doing data and app development, I've never encountered one.

              Also, in my experience, there are a LOT of devs who would rather write thousands of lines of buggy code to re-implement features already present in any RDBMS, than to spend the couple days it would take to learn SQL. Same with ORMs. There are valid use cases for these - though, again, I've yet to encounter one

    • by Somervillain ( 4719341 ) on Tuesday June 07, 2022 @03:39PM (#62601000)

      Relational database design, with good indexing and transaction management, has proven to be the best way to build databases. It amazes me that serious software developers ever choose Mongo.

      The dumbest mistake a professional could make would be using the wrong DB for their model. The relational model is great for what it's designed for: strongly typed and consistent data. If you're managing product inventory or financial transactions, it cannot be beat. It was very much optimized for those use cases.

      For my problem domain and nearly all I've worked on, the data model wasn't so rigid. I manage complex and dynamic data from customers. The structure will vary drastically based on what the customer paid for and how they choose to use our product. A document data model, like Mongo, is much better suited for our model. I've seen it done both ways. You can store it in a single Mongo collection (table) or 1000 relational tables in Oracle and get 100x the performance in Mongo for a fraction of the cost and about 1/100th of the total Java code needed to manage it.

      To correctly model all use cases, we need approx 1000 tables. In the end, those tables are just mashed together to form one big JSON document. About 25-50 REST endpoints manage the individual facets of the JSON document.

      Because someone was an idiot and said "Oracle is the best, we should always use Oracle" we maintain massive amounts of JPA data to massage this dynamic data into something regular and relational. When all you have is a hammer, everything looks like a nail. We have massive codebases that do nothing but take JSON, break it into JPA models that map to our tables, breaking them into 100s of pieces, then stitching them back together in the EXACT same format...never actually using those tiny pieces they broke them into. It's dysfunctional, expensive, our users suffer, our cloud computing bills are through the roof, our licensing costs are insane (Oracle is EXPENSIVE), but it definitely is properly relational. And, no, no amount of denormalization can fix it. The only way to fix it is to use a document DB or basically design our relational tables to act as a homemade document DB.

      Prior to my current job, I worked in healthcare software. It's a similar situation. Healthcare record databases are MASSIVE and dynamic. The record used by an oncology patient is much different than that of a dermatology patient. You can do things 2 ways:...either force a rigid unnatural structure on the data coming in so it can fit your DB model, which is how shitty healthcare systems operate with all their weird arbitrary codes...or you can let each practice dictate the model, but then it becomes a massive complex dataset that makes no sense and is nearly impossible to maintain....or you can use a document database, like Mongo, that allows more flexibility and lets the Oncology dept store data in their format and the radiology dept store theirs in another and the DB can largely be agnostic to it.

      Mongo is meant for dynamic structures. Relational DBs are meant for rigid structures. Cassandra is optimized for different problems than either.

      Database types are a lot like transportation types. I view Oracle as the semi-truck. It's great. It's powerful, but you'd never want to commute to work in a semi-truck. You'd never want to drive around the beach in a semi-truck. You wouldn't want to deliver packages or pizzas in a semi-truck. A semi-truck is great for its use case. A flatbed truck is better for others. A passenger car is better for hauling people and where I live, a bike is the best way to get around. It's stupid to make a semi-truck do the job of a Corolla or a bike and stupid to have a bike deliver freight.

      Pick the right tool for the job. Each major DB model has its strengths and weaknesses and is indispensable in some situation and horrible in others. For most jobs I've worked on, Mongo is much better than a relational DB, because I am usua

      • by Pascoea ( 968200 )

        You'd never want to drive around the beach in a semi-truck.

        Speak for yourself. That sounds like it could be fun.

      • Most major relational databases today have an indexable JSON data type. You don't need a thousand tables with hundreds of foreign keys.

        MongoDB is a one-hit wonder whose fifteen minutes were up quite a while ago. (I also find it difficult to forgive MongoDB for the numerous data integrity bugs they've had over the years.)
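As a rough illustration of the "indexable JSON in a relational database" point, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in. It is not Postgres `jsonb`; it assumes an SQLite build with the JSON functions available (the default in recent versions), and the table, index, and field names are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, doc TEXT NOT NULL)")
# Expression index on one field inside the JSON document.
conn.execute(
    "CREATE INDEX idx_records_customer ON records (json_extract(doc, '$.customer'))"
)

# Documents with differing structure live in the same single table.
docs = [
    {"customer": "acme", "plan": "basic"},
    {"customer": "globex", "plan": "pro", "addons": ["sso"]},
    {"customer": "acme", "plan": "pro", "seats": 40},
]
conn.executemany(
    "INSERT INTO records (doc) VALUES (?)", [(json.dumps(d),) for d in docs]
)

# The WHERE expression matches the indexed expression exactly, so the planner
# is able to use the index instead of scanning every document.
query = "SELECT id FROM records WHERE json_extract(doc, '$.customer') = ? ORDER BY id"
acme_ids = [row[0] for row in conn.execute(query, ("acme",))]
print(acme_ids)
```

The same idea in Postgres would use a `jsonb` column with an expression or GIN index; the syntax differs, but the design choice is the same: one table, flexible documents, and indexes only on the fields that are actually queried.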

  • So here's some bananas.
  • Is the encryption order-preserving? (Reduces the need to decrypt the data to perform operations)

    • by gweihir ( 88907 )

      Does not need to be. All you need is comparison for equality. You can sort or hash the encrypted table elements by their encrypted values. You just need to encrypt all of them with the same key and the same IV (bad!) if your encryption scheme needs an IV.
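A minimal sketch of the equality-only idea described above, under loud assumptions: it uses a keyed HMAC as a stand-in for "same key, same IV" deterministic encryption, it is not MongoDB's Queryable Encryption design (the thread does not describe that design), and the `Server` class, key, and sample names are hypothetical. The server stores and compares tokens without ever holding the key or the plaintext.

```python
import hashlib
import hmac


def equality_token(key: bytes, value: str) -> str:
    """Deterministic keyed token: the same key and value always give the same token."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


class Server:
    """Holds only tokens and record ids; it never sees the key or the plaintext."""

    def __init__(self):
        self.index = {}  # token -> list of record ids

    def insert(self, token, record_id):
        self.index.setdefault(token, []).append(record_id)

    def find_equal(self, token):
        return self.index.get(token, [])


# Client side: the key stays here (hypothetical value, for illustration only).
client_key = b"hypothetical-client-side-key"
server = Server()
for record_id, first_name in [(1, "Quinn"), (2, "Alice"), (3, "Quinn")]:
    server.insert(equality_token(client_key, first_name), record_id)

matches = server.find_equal(equality_token(client_key, "Quinn"))

# The downside of determinism: the server can see which records share a value.
repeated = [ids for ids in server.index.values() if len(ids) > 1]
```

Note the trade-off this makes visible: the server can answer equality lookups, but it also learns which records hold equal values, and a scheme like this supports neither range queries nor substring matches.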

      • by jd ( 1658 )

        When it's a key/value database, equality will tell you when a key matches, yes. Assuming, as you say, you use the same key and IV.

        This article looks interesting - a pseudo-public-key OPE system: https://dl.acm.org/doi/10.1145... [acm.org]

        This helps when you have to use the same keys everywhere, as the encryption key can't be used to decrypt the data.

  • I guess I'm a little confused. If I ask my DB for all of the contacts with a first name containing the letter "Q", I'm expecting it to only return that subset of my contacts. Obviously, it'll need to decrypt the contact names somewhere. So if the DB isn't doing it (which I've always thought was the efficient place to have it done, hence the indexing, and not transferring the many unmatched contacts), then which part is doing that hard work?

    Something's weird here.

    • by lsllll ( 830002 )

      See my post below. I basically asked the same question. My thoughts are that the indexing would have to be done by the client (who has the data and the key) and passed on to the database. You'd have to know all the indexes you're going to have ahead of time so that they can be built and passed on to the database. If you ever need to build a new index, all the data would have to be returned to the client for it to build the index and return it to the database. And of course you'll run into the situati

      • ...and wouldn't that then mean that your indexes are effectively unencrypted? I mean, if my million contacts are indexed with enough information to be used, then there's enough data gleanable from it too.

        Having watched their lovely, if a little-too-fast, animated diagram, it is indeed always encrypted at the db, with the decryption done elsewhere. So the security problem isn't at the db, it's at the elsewhere. Great.

  • by lsllll ( 830002 )

    This end-to-end client-side encryption uses novel encrypted index data structures; the data being searched remains encrypted at all times on the database server, including in memory and in the CPU. The keys never leave the application, and the company maintains that neither query speed nor overall application performance is impacted by the new feature.

    The only way the database would have an encrypted index is if it had the data and the key (private or public) to encrypt it, which means it had access to the data at one point, or the onus of indexing and encrypting falls on the client. But for the client to do that, the client must have access to all the data. So is the whole database transmitted to the client in order for a new index to be built?

    • by gweihir ( 88907 )

      Not needed. But they have a massive security problem. As far as I could find out, they are using CBC mode with a constant key and IV. That way they can compare for equality (and sort or hash) on the server without decryption. But of course the one thing you must never do with an IV is to re-use it unless the key has changed.

  • Going to reference my own explanation elsewhere in this comment section: https://slashdot.org/comments.... [slashdot.org]

    Looks like no actual cryptography expert was involved in the design of this mechanism. As a result, the users now have to be cryptography experts to understand the implications.

    • Looks like no actual cryptography expert was involved in the design of this mechanism. As a result, the users now have to be cryptography experts to understand the implications.

      Given all the problems we've seen with MongoDB default installs over the past several years, this unfortunately doesn't surprise me in the least. As a matter of fact, it's basically what I expected from a new MongoDB "feature".

      On a side note: it's a good excuse to re-link to this [youtu.be]

    • Comment removed based on user account deletion
      • by gweihir ( 88907 )

        Well, there _is_ a striking similarity in the two aspects...

      • by sfcat ( 872532 )

        Looks like no actual cryptography expert was involved in the design of this mechanism. As a result, the users now have to be cryptography experts to understand the implications.

        Kinda makes sense given most NoSQL database management systems were built by teams with no actual database experts...

        (OK, OK, I'm going to hell...)

        Posting factual information is a reason to go to hell? That explains social media...
