- user: m001-student
- pass: m001-mongodb-basics
- type this:
mongo "mongodb+srv://<username>:<password>@<cluster>.mongodb.net/admin"
- document: a set of field (key) and value pairs
- collection: has one or many documents
- database: has one or more collections
- cluster: group of servers that store your data
- How to set up:
- connect to mongo shell
- BSON is a binary-encoded form of JSON: faster to parse and supports more data types, but not human-readable
- JSON has:
- mongoimport
- mongoexport
- BSON has:
- mongodump
- mongorestore
mongodump --uri "mongodb+srv://<your username>:<your password>@<your cluster>.mongodb.net/sample_supplies"
mongoexport --uri="mongodb+srv://<your username>:<your password>@<your cluster>.mongodb.net/sample_supplies" --collection=sales --out=sales.json
mongorestore --uri "mongodb+srv://<your username>:<your password>@<your cluster>.mongodb.net/sample_supplies" --drop dump
mongoimport --uri="mongodb+srv://<your username>:<your password>@<your cluster>.mongodb.net/sample_supplies" --drop sales.json
- HOW TO FIND SOMETHING IN THE COLLECTION:
mongo "mongodb+srv://m001-student:m001-mongodb-basics@sandbox.mongodb.net/admin"
show dbs
(shows you all of the databases you have, ex: grades, students, teachers)
use sample_training
selects one of the databases
show collections
shows you the collections in the current database
db.zips.find({"state": "NY"})
searches the collection for documents with "state" : "NY"
db.zips.find({"state": "NY"}).pretty()
formats the results in a readable way
it
iterates to the next page of results (the shell shows 20 documents at a time)
- every document MUST have a unique _id value
- this allows for the same exact fields in different documents
- HOW TO INSERT A SINGLE DOCUMENT IN THE COLLECTION:
mongoimport --uri="mongodb+srv://m001-student:m001-mongodb-basics@sandbox.mongodb.net/sample_supplies" --drop sales.json
- connect to atlas cluster:
mongo "mongodb+srv://<username>:<password>@<cluster>.mongodb.net/admin"
- look for the collection we need:
use sample_training
- get a random document from collection:
db.inspections.findOne()
db.inspections.insert({ "_id" : ObjectId("56d61033a378eccde8a8354f"), "id" : "10021-2015-ENFO", "certificate_number" : 9278806, "business_name" : "ATLIXCO DELI GROCERY INC.", "date" : "Feb 20 2015", "result" : "No Violation Issued", "sector" : "Cigarette Retail Dealer - 127", "address" : { "city" : "RIDGEWOOD", "zip" : 11385, "street" : "MENAHAN ST", "number" : 1712 }})
inserts the document
- to check if it is inserted:
db.inspections.find({"id" : "10021-2015-ENFO", "certificate_number" : 9278806}).pretty()
- identical documents CAN exist in the collection as long as their _id values are different!
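- HOW TO INSERT MULTIPLE DOCUMENTS:
db.inspections.insert([ { "test": 1 }, { "test": 2 }, { "test": 3 } ])
inserts 3 documents (each one gets an auto-generated _id)
db.inspections.insert([{ "_id": 1, "test": 1 },{ "_id": 1, "test": 2 },{ "_id": 3, "test": 3 }])
inserts documents with the given _id values; inserts are ordered by default, so this stops at the duplicate _id error and the "_id": 3 document is never inserted
- to insert every document that does NOT cause an error, "ordered": false ->
db.inspections.insert([{ "_id": 1, "test": 1 },{ "_id": 1, "test": 2 },{ "_id": 3, "test": 3 }],{ "ordered": false})
- documents with UNIQUE _id values insert fine with the default "ordered": true (note the collection name here is inspection, not inspections) ->
db.inspection.insert([{ "_id": 1, "test": 1 },{ "_id": 3, "test": 3 }])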
- *** IMPORTANT *** : if you mistype a collection name (like inspection above) or a field name, it won't give you an error, but instead creates a new collection or field!!
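- rough sketch of the ordered vs unordered rule in plain JavaScript (insertAll is a made-up helper, NOT a MongoDB function, and it assumes every document already has an _id):

```javascript
// Hypothetical helper (not a MongoDB API): inserts docs into an in-memory
// array and enforces unique _id values. Assumes every doc already has an _id.
function insertAll(collection, docs, options = { ordered: true }) {
  let insertedCount = 0;
  const errors = [];
  for (const doc of docs) {
    if (collection.some((existing) => existing._id === doc._id)) {
      errors.push("duplicate _id: " + doc._id);
      if (options.ordered) break; // ordered: stop at the first error
      continue; // unordered: skip the bad document and keep going
    }
    collection.push(doc);
    insertedCount += 1;
  }
  return { insertedCount, errors };
}

const testDocs = [{ _id: 1, test: 1 }, { _id: 1, test: 2 }, { _id: 3, test: 3 }];
const orderedResult = insertAll([], testDocs); // only the first document gets in
const unorderedResult = insertAll([], testDocs, { ordered: false }); // _id 1 and _id 3 get in
console.log(orderedResult.insertedCount, unorderedResult.insertedCount);
```

- with ordered the third document is lost even though nothing is wrong with it, which is why "ordered": false helps for bulk inserts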
- findOne() looks for a document that matches the query
- updateOne() if multiple documents match the query, only ONE will be updated; query on "_id" to be sure which one
- updateMany() updates all documents that match the query
- HOW TO UPDATE A DOCUMENT:
- connect to atlas cluster:
mongo "mongodb+srv://<username>:<password>@<cluster>.mongodb.net/admin"
- select the collection:
use sample_training
- finds the document with a given zip code
db.zips.find({"zip":"1234"}).pretty()
- counts how many documents have the city HUDSON ->
db.zips.find({"city":"HUDSON"}).count()
- to update ALL of them ->
db.zips.updateMany({"city":"HUDSON"}, {"$inc":{"pop":10}})
- $inc is an update operator that increments the value in a field by some amount
- allows us to update MANY documents at the same time
- format -> {"$inc": {"pop": 10, "field2": increment value, ...}}
- to update ONLY ONE document:
db.zips.updateOne({"zip":"1234"}, {"$set": {"pop":17630}})
- $set sets the value of a field (and adds the field if it does not exist yet)
- to ADD an element to an array field:
db.grades.updateOne({"student_id":250, "class_id":339}, {"$push": {"scores": {"type": "extra credit", "score": 100}}})
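- what $inc, $set and $push do to a document, as a plain JavaScript sketch (applyUpdate is a made-up helper, it only handles top-level fields, and the starting pop of 100 is invented sample data):

```javascript
// Hypothetical helper (not a MongoDB API): applies a small subset of
// update operators to a copy of a plain object. Top-level fields only.
function applyUpdate(doc, update) {
  const result = JSON.parse(JSON.stringify(doc)); // copy, leave the original alone
  for (const [field, amount] of Object.entries(update.$inc ?? {})) {
    result[field] = (result[field] ?? 0) + amount; // a missing field starts from 0
  }
  for (const [field, value] of Object.entries(update.$set ?? {})) {
    result[field] = value; // overwrites the field, or creates it if it is new
  }
  for (const [field, value] of Object.entries(update.$push ?? {})) {
    result[field] = [...(result[field] ?? []), value]; // appends to the array
  }
  return result;
}

const hudson = { city: "HUDSON", pop: 100 }; // invented sample document
const afterInc = applyUpdate(hudson, { $inc: { pop: 10 } }); // pop goes from 100 to 110
const afterSet = applyUpdate(hudson, { $set: { pop: 17630 } }); // pop is replaced
const afterPush = applyUpdate(
  { student_id: 250, scores: [] },
  { $push: { scores: { type: "extra credit", score: 100 } } }
); // scores now holds one element
console.log(afterInc, afterSet, afterPush);
```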
- deleteOne() deletes ONE document; RECOMMENDED to query by "_id" (deleteOne({"_id": <value>})) so you know exactly which document gets deleted
- deleteMany() deletes MULTIPLE documents
- HOW TO DELETE MULTIPLE DOCUMENTS:
- get the collection:
use sample_training
- look for the document to delete:
db.inspections.find({"test":1}).pretty()
- look for another document:
db.inspections.find({"test":3}).pretty()
- delete documents w/ test 1:
db.inspections.deleteMany({"test":1})
- delete documents w/ test 3:
db.inspections.deleteMany({"test":3})
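- deleteOne vs deleteMany on an in-memory array, as a plain JavaScript sketch (both functions are made-up stand-ins, the filter only supports exact field matches, and the _id values are invented):

```javascript
// Hypothetical stand-ins (not MongoDB APIs) working on a plain array.
// The filter only supports exact field matches, e.g. { test: 1 }.
function matchesFilter(doc, filter) {
  return Object.entries(filter).every(([field, value]) => doc[field] === value);
}

function deleteManyFrom(collection, filter) {
  return collection.filter((doc) => !matchesFilter(doc, filter)); // drops every match
}

function deleteOneFrom(collection, filter) {
  const index = collection.findIndex((doc) => matchesFilter(doc, filter));
  if (index === -1) return collection; // nothing matched, nothing deleted
  return collection.filter((_, position) => position !== index); // drops the first match only
}

const sampleInspections = [
  { _id: "a", test: 1 },
  { _id: "b", test: 1 },
  { _id: "c", test: 3 },
];
const afterDeleteMany = deleteManyFrom(sampleInspections, { test: 1 }); // only the test: 3 document is left
const afterDeleteOne = deleteOneFrom(sampleInspections, { test: 1 }); // one test: 1 document survives
console.log(afterDeleteMany.length, afterDeleteOne.length);
```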
- to drop a collection from a database:
db.collection.drop()
- ** IMPORTANT ** when all collections are dropped from a DB, the DB NO LONGER APPEARS in the list of databases
- review MQL operators:
- update operators:
$inc, $set, $unset
- query operators:
$eq (equal to), $ne (not equal to), $gt (greater than), $lt (less than), $gte (>=), $lte (<=)
- logic operators:
- used for more than one statement:
{<operator> : [{statement1}, {statement2}, ...]}
- where <operator> is:
$and
$or
$nor
- used to negate:
{"$not": {statement}}
- $and is IMPLICIT when no logic operator is given!! can write it explicitly as:
{"$and" : [{"student_id": {"$gt": 25}}, {"student_id": {"$lt": 100}}] }
- OR (the better way) ->
{"student_id": {"$gt": 25, "$lt": 100}}
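- why the two forms give the same result, as a plain JavaScript sketch (matchesQuery is a made-up matcher that only knows $and and the six comparison operators, and every field condition must be an operator object; the student_id values are invented):

```javascript
// Hypothetical matcher (not MongoDB code) for a few comparison operators.
const comparisons = {
  $eq: (value, operand) => value === operand,
  $ne: (value, operand) => value !== operand,
  $gt: (value, operand) => value > operand,
  $gte: (value, operand) => value >= operand,
  $lt: (value, operand) => value < operand,
  $lte: (value, operand) => value <= operand,
};

// Every operator listed for a field must pass: that is the implicit $and.
function matchesField(value, conditions) {
  return Object.entries(conditions).every(([op, operand]) => comparisons[op](value, operand));
}

function matchesQuery(doc, query) {
  if (query.$and) return query.$and.every((sub) => matchesQuery(doc, sub));
  return Object.entries(query).every(([field, conditions]) => matchesField(doc[field], conditions));
}

const explicitAnd = { $and: [{ student_id: { $gt: 25 } }, { student_id: { $lt: 100 } }] };
const implicitAnd = { student_id: { $gt: 25, $lt: 100 } };
const students = [{ student_id: 10 }, { student_id: 50 }, { student_id: 150 }];
const explicitMatches = students.filter((student) => matchesQuery(student, explicitAnd));
const implicitMatches = students.filter((student) => matchesQuery(student, implicitAnd));
console.log(explicitMatches, implicitMatches); // both keep only student_id 50
```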
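- $expr lets a query use aggregation expressions, so fields of the same document can be compared with each other
{ $expr: { <expression> } }
- ex:
{"$expr": {"$eq": ["$start station id", "$end station id"]}}
- here,
$start station id
is the VALUE of start station id, and
$end station id
is the VALUE of end station id (the $ in front of a field name means "the value of this field")
- to add an element to an array OR turn a field into an array field:
$push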
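- how a $expr query of the shape { $expr: { $eq: [x, y] } } reads two fields of the same document, as a plain JavaScript sketch (resolveOperand and matchesExprEq are made-up helpers that only handle $eq; the station ids are invented):

```javascript
// "$field name" means "the value of that field in the current document".
function resolveOperand(doc, operand) {
  return typeof operand === "string" && operand.startsWith("$")
    ? doc[operand.slice(1)] // strip the $ and read the field
    : operand; // anything else is a plain constant
}

// Hypothetical evaluator (not MongoDB code) for { $expr: { $eq: [x, y] } } only.
function matchesExprEq(doc, query) {
  const [left, right] = query.$expr.$eq;
  return resolveOperand(doc, left) === resolveOperand(doc, right);
}

const sameStationQuery = { $expr: { $eq: ["$start station id", "$end station id"] } };
const roundTrip = { "start station id": 476, "end station id": 476 };
const oneWayTrip = { "start station id": 476, "end station id": 3 };
console.log(matchesExprEq(roundTrip, sameStationQuery), matchesExprEq(oneWayTrip, sameStationQuery));
```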
- when matching an array field against a whole array, you should list the elements in the same order they are shown in the document
{<array field> : {"$size": <number>}}
returns documents where the array field is exactly the given length
{<array field> : {"$all": <array>}}
returns documents whose array contains all the given elements REGARDLESS of their order in the array
- ex: find documents with 20 amenities, including all of the amenities in the query array:
db.listingsAndReviews.find({ "amenities": { "$size": 20, "$all": [ "Internet", "Wifi", "Kitchen", "Heating", "Family/kid friendly", "Washer", "Dryer", "Essentials", "Shampoo", "Hangers", "Hair dryer", "Iron", "Laptop friendly workspace" ] } }).pretty()
- count the listings that accommodate more than 6 people and have exactly 50 reviews:
db.listingsAndReviews.find({"accommodates": {"$gt": 6}, "reviews": {"$size": 50}}).count()
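- what $size and $all check on one array, as a plain JavaScript sketch (matchesArray is a made-up helper, NOT a MongoDB function; the amenities list is a shortened invented sample):

```javascript
// Hypothetical helper (not MongoDB code): checks one array against
// an optional $size and an optional $all condition.
function matchesArray(array, { $size, $all }) {
  if ($size !== undefined && array.length !== $size) return false; // exact length only
  if ($all !== undefined && !$all.every((item) => array.includes(item))) return false;
  return true;
}

const sampleAmenities = ["Wifi", "Kitchen", "Heating"];
const hasBothAnyOrder = matchesArray(sampleAmenities, { $all: ["Heating", "Wifi"] }); // order does not matter
const hasTwoItems = matchesArray(sampleAmenities, { $size: 2 }); // length is 3, so no match
const threeWithWifi = matchesArray(sampleAmenities, { $size: 3, $all: ["Wifi"] }); // both conditions hold
console.log(hasBothAnyOrder, hasTwoItems, threeWithWifi);
```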
- Projection: specifies the fields that SHOULD or SHOULDN'T be in the result
- Projection syntax:
db.<collection>.find({<query>}, {<projection>})
- 1: include the field, 0: exclude the field (don't mix 1s and 0s in one projection; the only exception is "_id": 0)
- ex:
db.<collection>.find({<query>}, {<field1>: 0, <field2>: 0})
- another ex: What if you wanted to find ONLY the names of companies with 8 funding rounds?
db.companies.find({ "funding_rounds": { "$size": 8 } }, { "name": 1, "_id": 0 })
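- how the 1s and 0s of a projection pick fields, as a plain JavaScript sketch (projectDoc is a made-up helper for top-level fields only; the company document is invented):

```javascript
// Hypothetical helper (not MongoDB code): applies a projection of
// 1s (include) or 0s (exclude) to the top-level fields of one document.
function projectDoc(doc, projection) {
  const { _id: idFlag = 1, ...fields } = projection; // _id is included unless set to 0
  const including = Object.values(fields).some((flag) => flag === 1);
  const result = {};
  for (const key of Object.keys(doc)) {
    if (key === "_id") {
      if (idFlag === 1) result._id = doc._id;
      continue;
    }
    // include mode keeps only the 1s, exclude mode keeps everything but the 0s
    const keep = including ? fields[key] === 1 : fields[key] !== 0;
    if (keep) result[key] = doc[key];
  }
  return result;
}

const sampleCompany = { _id: 7, name: "Example Co", funding_rounds: [], founded_year: 2001 };
const nameOnly = projectDoc(sampleCompany, { name: 1, _id: 0 }); // just the name
const withoutRounds = projectDoc(sampleCompany, { funding_rounds: 0 }); // everything except funding_rounds
console.log(nameOnly, withoutRounds);
```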
- what if you want to match on a SPECIFIC field of the elements in an array?
- we use
{<field>: {"$elemMatch": {<field>: <value>}}}
- where the outer <field> is the name of the array, and <field>: <value> is matched against each element in the array
- ex:
db.grades.find({ "class_id": 431 }, { "scores": { "$elemMatch": { "score": { "$gt": 85 } } } }).pretty()
- what if you want to find all students who got extra credit?:
db.grades.find({ "scores": { "$elemMatch": { "type": "extra credit" } } }).pretty()
- what if you want to find how many companies have an office in Seattle?
db.companies.find({"offices": {"$elemMatch": {"city": "Seattle"}}}).count()
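- $elemMatch matches a document when at least ONE array element meets the condition, so the count() above counts companies, not offices; plain JavaScript sketch (elemMatch is a made-up helper for exact matches only, and the companies are invented):

```javascript
// Hypothetical helper (not MongoDB code): true when at least one element
// of the array has all the given field values (exact matches only).
function elemMatch(array, condition) {
  return array.some((element) =>
    Object.entries(condition).every(([field, value]) => element[field] === value)
  );
}

const sampleCompanies = [
  { name: "Example Co", offices: [{ city: "Seattle" }, { city: "Seattle" }, { city: "Austin" }] },
  { name: "Other Co", offices: [{ city: "Boston" }] },
];
// One company matches, even though it has two Seattle offices.
const seattleCompanies = sampleCompanies.filter((company) =>
  elemMatch(company.offices, { city: "Seattle" })
).length;
console.log(seattleCompanies);
```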
- we can use dot notation to get the value of an item in a field:
db.trips.findOne({ "start station location.type": "Point" })
here, start station location.type: "Point" is accessed through dot notation
- for a single condition, dot notation is quicker to write than $elemMatch
- to query an element in subdocument, use dot notation
- ex: how many trips start at locations that are west of -74 longitude?
- longitude decreases as you move west, and is the 0th element in the array
db.trips.find({"start station location.coordinates.0": {"$lt": -74}}).count()
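- how a dotted path walks into subdocuments and array positions, as a plain JavaScript sketch (getPath is a made-up helper and simpler than real dot notation, which can also look inside every element of an array; the coordinates are invented):

```javascript
// Hypothetical helper (not MongoDB code): follows a dotted path such as
// "a.b.0" through nested objects and arrays ("0" is an array position).
function getPath(doc, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

const sampleTrip = {
  "start station location": { type: "Point", coordinates: [-74.01, 40.71] },
};
const locationType = getPath(sampleTrip, "start station location.type");
const longitude = getPath(sampleTrip, "start station location.coordinates.0");
const startsWestOfMinus74 = longitude < -74; // same test as {"$lt": -74}
console.log(locationType, longitude, startsWestOfMinus74);
```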
- Aggregation Framework: it's another way to query in MongoDB
- aggregation syntax:
db.listingsAndReviews.aggregate([ { "$match": { "amenities": "Wifi" } }, { "$project": { "price": 1, "address": 1, "_id": 0 }}]).pretty()
- aggregation runs the data through a pipeline of stages, in order
- $match and $project are filters:
$match
: filters out documents whose amenities do not include "Wifi"
$project
: filters out fields that are not price or address (because they both equal 1, include them)
$group
: groups documents by the expression given as _id, ex: {"$group": { "_id": "$address.country", ....}}
another ex:
db.listingsAndReviews.aggregate([ { "$project": { "address": 1, "_id": 0 }}, { "$group": { "_id": "$address.country" }}])
another ex:
db.listingsAndReviews.aggregate([ { "$project": { "address": 1, "_id": 0 }}, { "$group": { "_id": "$address.country", "count": { "$sum": 1 } } } ])
this is saying: we only want the address from each document, then group the documents by "_id" : "$address.country" (dot notation), so we get ONE output document for each different country, and "count": { "$sum": 1 } adds 1 for every document in that group!!
- REMEMBER: this does not modify the data!!
- so basically, we can look for very specific things in the data and output one document per group!!
- $project is WHICH fields we keep
- $group is HOW the documents get grouped in the output (using the $ field-value syntax)
ex: what room types are present in the collection?
use:
db.listingsAndReviews.aggregate([{"$project": {"room_type": 1, "_id": 0}}, {"$group": {"_id": "$room_type"}}]).pretty()
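- what $group with "count": { "$sum": 1 } works out to, as a plain JavaScript sketch (groupCount is a made-up helper; the listings are invented, and real $group output order is not guaranteed):

```javascript
// Hypothetical helper (not MongoDB code): groups documents by the value at
// a dotted path and counts the documents in each group.
function groupCount(docs, keyPath) {
  const counts = new Map();
  for (const doc of docs) {
    const key = keyPath.split(".").reduce((value, part) => value?.[part], doc);
    counts.set(key, (counts.get(key) ?? 0) + 1); // same idea as "count": { "$sum": 1 }
  }
  return [...counts].map(([_id, count]) => ({ _id, count })); // one output document per group
}

const sampleListings = [
  { address: { country: "Portugal" } },
  { address: { country: "Brazil" } },
  { address: { country: "Portugal" } },
];
const countryCounts = groupCount(sampleListings, "address.country");
console.log(countryCounts);
```

- the input array is untouched, which matches the "does not modify the data" rule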
$sort
sorts data in ascending/descending order
- syntax:
db.zips.find().sort({<field>: <order>})
- where order is 1 (ascending) or -1 (descending)
$limit
specifies HOW MANY results we get
- syntax for "$sort" and "$limit":
db.<collection>.find().sort().limit()
- what if we want to find top result?
ex: lets find the city with the least number of people:
db.zips.find().sort({ "pop": 1 }).limit(1)
why is this? it sorts by population from least people to most people
"pop": 1
means sort populations increasing (gives us the least populated first)
"pop": 0
is not a sort order; as a query, find({"pop": 0}) gives us ALL zip codes with a population of 0
"pop": -1
means sort populations decreasing (gives us the most populated first)
- ex:
db.zips.find().sort({ "pop": -1 }).limit(1)
- what if we want to find the top 3 or top 10 results?
- for top 10:
db.zips.find().sort({ "pop": -1 }).limit(10)
(gives top 10 results in decreasing order)
- for top 3:
db.zips.find().sort({ "pop": -1 }).limit(3)
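- sort then limit on a plain array, as a JavaScript sketch (sortAndLimit is a made-up helper for ONE numeric field; the cities and populations are invented):

```javascript
// Hypothetical helper (not MongoDB code): sorts by one numeric field
// (1 = ascending, -1 = descending) and keeps the first `limit` results.
function sortAndLimit(docs, sortSpec, limit) {
  const [[field, order]] = Object.entries(sortSpec); // single-field sort only
  return [...docs] // copy so the input is not reordered
    .sort((a, b) => (a[field] - b[field]) * order)
    .slice(0, limit);
}

const sampleZips = [
  { city: "A", pop: 500 },
  { city: "B", pop: 0 },
  { city: "C", pop: 9000 },
];
const leastPopulated = sortAndLimit(sampleZips, { pop: 1 }, 1); // city B
const topTwo = sortAndLimit(sampleZips, { pop: -1 }, 2); // city C, then city A
console.log(leastPopulated, topTwo);
```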
- using an index is FASTER than sorting at query time with $sort!!
- an index keeps the values of a field in alphabetical/numerical order
- syntax:
db.trips.createIndex({"birth year": 1})
db.trips.find({"birth year": 1989})
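- this keeps birth years sorted so querying is faster!
- helps make querying efficient BUT a single field index only covers queries on that one field
- for a query that filters on one field AND sorts on another, use a compound index (more than one field):
db.trips.find({"start station id": 476}).sort({"birth year": 1})
db.trips.createIndex({"start station id": 1, "birth year": 1})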
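- why sorted values make lookups faster, as a plain JavaScript sketch (buildIndex and findPosition are made-up helpers; this is a simplified picture with a sorted array and binary search, while MongoDB's real indexes are B-tree structures; birth years are invented):

```javascript
// Simplified picture of an index: the field values kept in sorted order,
// each one pointing back at the position of its document.
function buildIndex(docs, field) {
  return docs
    .map((doc, position) => ({ key: doc[field], position }))
    .sort((a, b) => a.key - b.key);
}

// Binary search over the sorted keys instead of scanning every document.
// Returns the position of a matching document, or -1 when there is none.
function findPosition(index, key) {
  let low = 0;
  let high = index.length - 1;
  while (low <= high) {
    const middle = Math.floor((low + high) / 2);
    if (index[middle].key === key) return index[middle].position;
    if (index[middle].key < key) low = middle + 1;
    else high = middle - 1;
  }
  return -1;
}

const sampleTrips = [{ "birth year": 1992 }, { "birth year": 1969 }, { "birth year": 1989 }];
const birthYearIndex = buildIndex(sampleTrips, "birth year");
const positionOf1989 = findPosition(birthYearIndex, 1989); // third document in sampleTrips
const positionOf2000 = findPosition(birthYearIndex, 2000); // no such birth year
console.log(positionOf1989, positionOf2000);
```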
- data modeling is a way to organize fields to support your app performance and querying
- REMEMBER: data is stored the way it's used!!
- Organizing:
{ "name": "", "age": #, "pref cont": "", "conts": [{},{},{}], "prescriptions": [{},{},{}, {}], "allergies": [], "prior visits": [{}], "next visit": "", "diagnoses": [] }
- Query:
db.patient.find({"name": "Cora"})
db.patient.find({"next visit": "12-15"})
db.medication.find({"uses": "flu"})
db.medication.find({"code": 329})
- notice how data modeling is faster and easier to read?!!?!?!
- upsert is a hybrid of update and insert
- syntax:
db.collection.updateOne({<query>}, {<update>}, {"upsert": true})
- if upsert is true: when a document matches the query, update it; when nothing matches, insert a new document
another example:
db.iot.updateOne({"sensor": r.sensor, "date": r.date, "valcount": {"$lt": 48}}, {"$push": {"readings": {"v": r.value, "t": r.time}}}, {"upsert": true})
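- the update-or-insert decision, as a plain JavaScript sketch (upsertOne is a made-up helper; the filter is exact-match only and changes behave like a $set; the sensor data is invented):

```javascript
// Hypothetical helper (not MongoDB code): updates the first document that
// matches the filter, or inserts a new one built from filter + changes.
function upsertOne(collection, filter, changes) {
  const match = collection.find((doc) =>
    Object.entries(filter).every(([field, value]) => doc[field] === value)
  );
  if (match) {
    Object.assign(match, changes); // a document matched: update it in place
    return "updated";
  }
  collection.push({ ...filter, ...changes }); // nothing matched: insert a new document
  return "inserted";
}

const sensorDocs = [];
const firstCall = upsertOne(sensorDocs, { sensor: "s1" }, { value: 20 }); // no match yet
const secondCall = upsertOne(sensorDocs, { sensor: "s1" }, { value: 25 }); // matches the new document
console.log(firstCall, secondCall, sensorDocs);
```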
- $match:
{ amenities: "Wifi" }
--> keep only the documents with amenities including "Wifi" (with all of their fields)
- $project:
{ price: 1, address: 1 }
--> specify which fields to be included/excluded
- $group:
{ _id: "$address.country", count: { $sum: 1 }, total_price: { $sum: "$price" } }
--> return _id and the computed fields (we can make up the field names)
- $count:
"num_countries"
---> returns a single document with one field (num_countries) whose value is the number of documents reaching this stage, here the number of countries
- a project can have different clusters (ie, different people using the same database)