Storing JSON in a database vs. having a new column for each key
I am implementing the following model for storing user-related data in my table. I have 2 columns: uid (primary key) and a meta column which stores other data about the user in JSON format.
uid | meta
--------------------------------------------------
1 | {name:['foo'],
| emailid:['foo@bar.com','bar@foo.com']}
--------------------------------------------------
2 | {name:['sann'],
| emailid:['sann@bar.com','sann@foo.com']}
--------------------------------------------------
Is this a better approach (performance-wise, design-wise) than the one-column-per-property model, where the table would have many columns like uid, name, emailid?
What I like about the first model is that you can add as many fields as you like; there is no limitation.
Also, now that I have implemented the first model, how do I perform a query on it? For example, I want to fetch all the users who have a name like 'foo'.
Question - Which is the better way to store user-related data in a database (keeping in mind that the number of fields is not fixed): JSON or column-per-field? Also, if the first model is implemented, how do I query the database as described above? Should I use both models, storing all the data which may be searched by a query in a separate row and the other data in JSON (in a different row)?
Since there won't be too many columns on which I need to perform search, is it wise to use both the models? Key-per-column for the data I need to search and JSON for others (in the same MySQL database)?
Recommended answer
Updated 4 June 2017
Given that this question/answer have gained some popularity, I figured it was worth an update.
When this question was originally posted, MySQL had no support for JSON data types and the support in PostgreSQL was in its infancy. Since 5.7, MySQL now supports a JSON data type (in a binary storage format), and PostgreSQL JSONB has matured significantly. Both products provide performant JSON types that can store arbitrary documents, including support for indexing specific keys of the JSON object.
However, I still stand by my original statement that your default preference, when using a relational database, should still be column-per-value. Relational databases are still built on the assumption that the data within them will be fairly well normalized. The query planner has better optimization information when looking at columns than when looking at keys in a JSON document. Foreign keys can be created between columns (but not between keys in JSON documents). Importantly: if the majority of your schema is volatile enough to justify using JSON, you might want to at least consider whether a relational database is the right choice.
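As a rough illustration of the column-per-value layout, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL or PostgreSQL. The table names (users, user_emails) and the index name (idx_users_name) are assumptions made up for this example, not part of the question. It shows the two advantages mentioned above: an index the planner can use for the name lookup, and a foreign key that rejects orphaned rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users (
        uid  INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- Hypothetical child table: one row per email address.
    CREATE TABLE user_emails (
        uid     INTEGER NOT NULL REFERENCES users(uid),
        emailid TEXT NOT NULL,
        PRIMARY KEY (uid, emailid)
    );
    CREATE INDEX idx_users_name ON users(name);
""")

conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "foo"), (2, "sann")])
conn.executemany(
    "INSERT INTO user_emails VALUES (?, ?)",
    [(1, "foo@bar.com"), (1, "bar@foo.com"),
     (2, "sann@bar.com"), (2, "sann@foo.com")],
)

# The question's query: fetch all users named 'foo'.
query = "SELECT uid FROM users WHERE name = ?"
uids = [row[0] for row in conn.execute(query, ("foo",))]

# Ask the planner how it would run the query; the last column of each
# plan row is a human-readable description of the chosen strategy.
plan = " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query, ("foo",)))

# An email for a user that does not exist violates the foreign key.
try:
    conn.execute("INSERT INTO user_emails VALUES (?, ?)", (99, "ghost@bar.com"))
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

print(uids, fk_rejected)
print(plan)
```

Neither the index lookup nor the foreign key check is available in the same form when name and emailid live inside a JSON document in a single column.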
That said, few applications are perfectly relational or document-oriented. Most applications have some mix of both. Here are some examples where I personally have found JSON useful in a relational database:
Storing email addresses and phone numbers for a contact, where keeping them as values in a JSON array is much easier to manage than multiple separate tables
Saving arbitrary key/value user preferences (where the value can be boolean, textual, or numeric, and you don't want to have separate columns for different data types)
Storing configuration data that has no defined schema (for example, if you're building something like Zapier or IFTTT and need to store configuration data for each integration)
I'm sure there are others as well, but these are just a few quick examples.
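The user-preferences case can be sketched as a mixed table: a real column for the field that gets searched and a JSON text column for everything else. This is only an illustration in Python with the standard sqlite3 and json modules; the column name prefs and the helper functions save_user and load_prefs are hypothetical, and a production setup on MySQL 5.7+ or PostgreSQL would more likely use the native JSON or JSONB type instead of plain text.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        uid   INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,   -- searched often, so it gets a real column
        prefs TEXT NOT NULL    -- free-form JSON that is never filtered on
    )
""")
conn.execute("CREATE INDEX idx_users_name ON users(name)")


def save_user(uid, name, prefs):
    """Store the searchable field as a column and the rest as JSON text."""
    conn.execute(
        "INSERT INTO users VALUES (?, ?, ?)",
        (uid, name, json.dumps(prefs)),
    )


def load_prefs(name):
    """Filter on the indexed column, then decode the JSON for display."""
    rows = conn.execute("SELECT uid, prefs FROM users WHERE name = ?", (name,))
    return {uid: json.loads(prefs) for uid, prefs in rows}


# Values of mixed types (boolean, text, number) fit in one column.
save_user(1, "foo", {"newsletter": True, "theme": "dark", "page_size": 50})
save_user(2, "sann", {"theme": "light"})

foo_prefs = load_prefs("foo")
print(foo_prefs)
```

The design choice here is that the database only ever filters on name; the JSON is written and read back whole, so its lack of a fixed schema costs nothing at query time.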
If you really want to be able to add as many fields as you want with no limitation (other than an arbitrary document size limit), consider a NoSQL solution such as MongoDB.
For relational databases: use one column per value. Putting a JSON blob in a column makes it virtually impossible to query (and painfully slow when you actually find a query that works).
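To make the cost concrete, here is a small sketch of what the question's query looks like when meta is an opaque JSON string and the database has no JSON functions to help. It uses Python's sqlite3 and json modules purely for illustration; the helper find_by_name is hypothetical, and the sample rows are the question's data rewritten as valid JSON.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (uid INTEGER PRIMARY KEY, meta TEXT NOT NULL)")

sample = [
    (1, {"name": ["foo"], "emailid": ["foo@bar.com", "bar@foo.com"]}),
    (2, {"name": ["sann"], "emailid": ["sann@bar.com", "sann@foo.com"]}),
]
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(uid, json.dumps(meta)) for uid, meta in sample],
)


def find_by_name(name):
    """Read every row and decode its JSON in the application to find matches."""
    matches = []
    for uid, meta in conn.execute("SELECT uid, meta FROM users"):
        if name in json.loads(meta).get("name", []):
            matches.append(uid)
    return matches


print(find_by_name("foo"))
print(find_by_name("bar"))
```

Every lookup reads and parses the whole table. A text match such as LIKE '%bar%' on the raw column would avoid the parsing but would also hit the email addresses, which is why the second call, which checks only the name key, finds nothing.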
Relational databases take advantage of data types when indexing, and are intended to be implemented with a normalized structure.
As a side note: this isn't to say you should never store JSON in a relational database. If you're adding true metadata, or if your JSON is describing information that does not need to be queried and is only used for display, it may be overkill to create a separate column for all of the data points.