MongoDB Queries
Connect to MongoDB
Connect via Mongo Shell to the database mongodbqueries on a DBI5 cluster.
mongosh mongodb+srv://5hif:5hif@dbi5.4e2bjau.mongodb.net/ --username 5hif
Password is 5hif
use mongodbqueries
Vienna Museums (collection museum)
1.1) How many museums are listed in the collection?
db.museum.countDocuments()

1.2) Which museums can be found in the 4th district of Vienna (only address information of interest)?
db.museum.find({ "properties.BEZIRK": 4 }, { "properties.ADRESSE": 1, _id: 0 })
- to only show specific fields, use projection (2nd parameter of find)
- set
_id: 0to hide the_idfield (shown by default) - use
1to show a field,0to hide a field

1.3) What are the first three museums sorted with alphabetical order by name? Make sure to list just the names and the ids
db.museum.find({}, { "properties.NAME": 1 }).sort({ "properties.NAME": 1 }).limit(3)
- sorting in ascending order is 1, descending order is -1
limit(n)limits the number of documents returned to n

1.4) Which museums are located in the 13th or 14th district? Sort the result first by district, then by name ascending
db.museum.find({ "properties.BEZIRK": { $in: [13, 14] } }).sort({ "properties.BEZIRK": 1, "properties.NAME": 1 })
or with $or operator:
db.museum.find({ $or: [ { "properties.BEZIRK": 13 }, { "properties.BEZIRK": 14 } ] }).sort({ "properties.BEZIRK": 1, "properties.NAME": 1 })

1.5) How many museums are located outside the 1st district?
db.museum.countDocuments({ "properties.BEZIRK": { $ne: 1 } })
$nematches all values that are not equal to a specified value

1.6) How much is the number of museums outside the inner districts 1 to 9?
db.museum.countDocuments({ "properties.BEZIRK": { $gt: 9 } })

1.7) In which districts do you find specific museums (“Bezirksmuseen”, district numbers ONLY)?
db.museum.distinct("properties.BEZIRK", { "properties.NAME": { $regex: "Bezirksmuseum" } })

1.8) Which museums have an address where the house number is 5?
db.museum.find({ "properties.ADRESSE": { $regex: " 5$" } }, { "properties.NAME": 1, "properties.ADRESSE": 1, _id: 0 })
$regexprovides regular expression capabilities for pattern matching strings in queries$matches the end of a string

Which districts, in ascending order, have 5 or more, and how many, museums?
db.museum.aggregate([
{ $group: { _id: "$properties.BEZIRK", museumCount: { $sum: 1 } } },
{ $match: { museumCount: { $gte: 5 } } },
{ $sort: { _id: 1 } }
])
$groupgroups input documents by the specified _id expression and applies the accumulator expressions, indicates a field)$sumcalculates the sum of numeric values$matchfilters the documents to pass only the documents that match the specified condition(s)$sortsorts the documents based on the specified field(s) (1 for ascending, -1 for descending)

Finding the top 3 districts with the most museums
db.museum.aggregate([
{ $group: { _id: "$properties.BEZIRK", museumCount: { $sum: 1 } } },
{ $sort: { museumCount: -1 }},
{ $limit: 3 }
])
$limitlimits the number of documents returned to n

Twitter (collection tweets)
2.1) How many tweets in the collection origin from users with friends’ count in 80,90?
db.tweets.countDocuments({ "user.friends_count": { $gte: 80, $lte: 90 } })
$gtematches values that are greater than or equal to a specified value$ltematches values that are less than or equal to a specified value
other filter operators:
$inmatches any of the values specified in an array$nematches all values that are not equal to a specified value

2.2) Are there any documents that miss the “source” field?
db.tweets.countDocuments({ "source": { $exists: false } })
$existsmatches documents that have the specified field

2.3) How many tweets have exactly three hashtags?
db.tweets.countDocuments({ "entities.hashtags": { $size: 3 } })
$sizematches any array with the specified number of elements

2.4) Give me the tweets contain a hashtag “love” (case-sensitive)?
db.tweets.find({ "entities.hashtags.text": "love" }, { "text": 1, "entities.hashtags": 1 })

2.5) How many tweets contain a hashtag “love” (case-insensitive)?
db.tweets.countDocuments({ "entities.hashtags.text": { $regex: "love", $options: "i" } })
imakes the search case-insensitive (default is case-sensitive)

2.6) What is the number of tweets per user language in descending order?
db.tweets.aggregate([
{ $group: { _id: "$user.lang", tweetCount: { $sum: 1 } } },
{ $sort: { tweetCount: -1 } }
])

2.7) Extend 2.6 and include the average number of followers and the size in bytes of the longest tweet text per user language
db.tweets.aggregate([
{ $group: { _id: "$user.lang", tweetCount: { $sum: 1 }, avgFollowers: { $avg: "$user.followers_count" },
longestTweetSize: { $max: { $strLenBytes: { $ifNull: ["$text", ""] } } } }},
{ $sort: { tweetCount: -1 } }
])
$avgcalculates the average of numeric values$maxreturns the maximum value$strLenBytesreturns the number of bytes in a string
