$fill (aggregation)
Definition
- $fill
- New in version 5.3. - Populates - nulland missing field values within documents.- You can use - $fillto populate missing data points:- In a sequence based on surrounding values. 
- With a fixed value. 
 
Syntax
The $fill stage has this syntax:
{    $fill: {       partitionBy: <expression>,       partitionByFields: [ <field 1>, <field 2>, ... , <field n> ],       sortBy: {          <sort field 1>: <sort order>,          <sort field 2>: <sort order>,          ...,          <sort field n>: <sort order>       },       output: {          <field 1>: { value: <expression> },          <field 2>: { method: <string> },          ...       }    } } 
The $fill stage takes a document with these fields:
| Field | Necessity | Description | 
|---|---|---|
| Optional | Specifies an expression to group
the documents. In the  If you omit partitionBy and
partitionByFields,
 partitionBy and partitionByFields are mutually exclusive. See an example. | |
| Optional | Specifies an array of fields as the compound key to group the
documents. In the  If you omit partitionBy and
partitionByFields,
 partitionBy and partitionByFields are mutually exclusive. | |
| Required if method is specified in at least one output.<field>. Otherwise, optional. | Specifies the field or fields to sort the documents within each
partition. Uses the same syntax as the  | |
| Required | Specifies an object containing each field for which to fill missing values. You can specify multiple fields in the output object. The object name is the name of the field to fill. The object value specifies how the field is filled. | |
| Required | Specifies an object indicating how to fill missing values in the target field. The object name must be either  
 | 
Behavior and Restrictions
partitionByFields Restrictions
$fill returns an error if any field name in the
partitionByFields array:
- Evaluates to a non-string value. 
- Begins with - $.
linear Behavior
The linear fill method fills null and missing fields using
linear interpolation based on the surrounding non-null values in
the sequence.
- For each document where the field is - nullor missing,- linearFillfills those fields in proportion to the missing value range between surrounding non-- nullvalues according to the sortBy order. To determine the values for missing fields,- linearFilluses:- The difference of surrounding non- - nullvalues.
- The number of - nullfields to fill between the surrounding values.
 
- The - linearmethod can fill multiple consecutive- nullvalues if those values are preceded and followed by non-- nullvalues according to the sortBy order.- Example- If a collection contains these documents: - { index: 0, value: 0 }, - { index: 1, value: null }, - { index: 2, value: null }, - { index: 3, value: null }, - { index: 4, value: 10 } - After using the - linearfill method to fill the- nullvalues, the documents become:- { index: 0, value: 0 }, - { index: 1, value: 2.5 }, - { index: 2, value: 5 }, - { index: 3, value: 7.5 }, - { index: 4, value: 10 } 
- nullvalues that are not preceded and followed by non-- nullvalues remain- null.
- To use the - linearfill method, you must also use the sortBy field to sort your data.
For a complete example using the linear fill method, see
Fill Missing Field Values with Linear Interpolation.
locf Behavior
locf stands for last observation carried forward.
- If a field being filled contains both - nulland non-null values,- locfsets the- nulland missing values to the field's last known non-null value according to the sortBy order.- If the field contains only - nullor missing values in a partition,- locfsets the field value to- nullfor that partition.
- nulland missing field values that appear before non-null values in the sort order remain- null.
 
- To use the - locffill method, you must also use the sortBy field to sort your data.
For a complete example using the locf fill method, see
Fill Missing Field Values Based on the Last Observed Value.
Comparison of $fill and Aggregation Operators
To fill null and missing field values within a document you can use:
- The - $fillstage.- When you use the - $fillstage, the field you specify in the output is the same field used as the source data.
- The - $linearFilland- $locfaggregation operators.- When you - $linearFillor- $locf, you can set values for a different field than the field used as the source data.
Examples
The examples in this section show how to use $fill to
fill missing values:
Fill Missing Field Values with a Constant Value
A shoe store maintains a dailySales collection that contains a
document summarizing each day's sales. The shoe store sells these types
of shoes:
- boots
- sandals
- sneakers
Create the following dailySales collection:
db.dailySales.insertMany( [    {       "date": ISODate("2022-02-02"),       "bootsSold": 10,       "sandalsSold": 20,       "sneakersSold": 12    },    {       "date": ISODate("2022-02-03"),       "bootsSold": 7,       "sneakersSold": 18    },    {       "date": ISODate("2022-02-04"),       "sneakersSold": 5    } ] ) 
Not all of the documents in the dailySales collection contain each
shoe type. If a shoe type is missing, it means there were no shoes of
that type sold on the corresponding date.
The following example uses $fill to set the quantities sold
to 0 for the missing shoe types for each day's sales:
db.dailySales.aggregate( [    {       $fill:          {             output:                {                   "bootsSold": { value: 0 },                   "sandalsSold": { value: 0 },                   "sneakersSold": { value: 0 }                }          }    } ] ) 
In the preceding pipeline:
- $fillfills in values for missing fields.
- output specifies: - The names of the fields to fill in. 
- The value to set the filled in fields to. In this example, the output specifies a constant value of - 0.
 
Example output:
[    {      _id: ObjectId("6202df9f394d47411658b51e"),      date: ISODate("2022-02-02T00:00:00.000Z"),      bootsSold: 10,      sandalsSold: 20,      sneakersSold: 12    },    {      _id: ObjectId("6202df9f394d47411658b51f"),      date: ISODate("2022-02-03T00:00:00.000Z"),      bootsSold: 7,      sneakersSold: 18,      sandalsSold: 0    },    {      _id: ObjectId("6202df9f394d47411658b520"),      date: ISODate("2022-02-04T00:00:00.000Z"),      sneakersSold: 5,      bootsSold: 0,      sandalsSold: 0    } ] 
Fill Missing Field Values with Linear Interpolation
Create a stock collection that contains tracks a single company's
stock price at hourly intervals:
db.stock.insertMany( [    {       time: ISODate("2021-03-08T09:00:00.000Z"),       price: 500    },    {       time: ISODate("2021-03-08T10:00:00.000Z"),    },    {       time: ISODate("2021-03-08T11:00:00.000Z"),       price: 515    },    {       time: ISODate("2021-03-08T12:00:00.000Z")    },    {       time: ISODate("2021-03-08T13:00:00.000Z")    },    {       time: ISODate("2021-03-08T14:00:00.000Z"),       price: 485    } ] ) 
The price field is missing for some of the documents in the
collection.
To populate the missing price values using linear interpolation,
use $fill with the linear fill method:
db.stock.aggregate( [    {       $fill:          {             sortBy: { time: 1 },             output:                {                   "price": { method: "linear" }                }          }    } ] ) 
In the preceding pipeline:
- $fillfills in values for missing fields.
- sortBy: { time: 1 }sorts the documents by the- timefield in ascending order, from earliest to latest.
- output specifies: - priceas the field for which to fill in missing values.
- { method: "linear" }as the fill method. The- linearfill method fills missing- pricevalues using linear interpolation based on the surrounding- pricevalues in the sequence.
 
Example output:
[    {       _id: ObjectId("620ad41c394d47411658b5e9"),       time: ISODate("2021-03-08T09:00:00.000Z"),       price: 500    },    {       _id: ObjectId("620ad41c394d47411658b5ea"),       time: ISODate("2021-03-08T10:00:00.000Z"),       price: 507.5    },    {       _id: ObjectId("620ad41c394d47411658b5eb"),       time: ISODate("2021-03-08T11:00:00.000Z"),       price: 515    },    {       _id: ObjectId("620ad41c394d47411658b5ec"),       time: ISODate("2021-03-08T12:00:00.000Z"),       price: 505    },    {       _id: ObjectId("620ad41c394d47411658b5ed"),       time: ISODate("2021-03-08T13:00:00.000Z"),       price: 495    },    {       _id: ObjectId("620ad41c394d47411658b5ee"),       time: ISODate("2021-03-08T14:00:00.000Z"),       price: 485    } ] 
Fill Missing Field Values Based on the Last Observed Value
Create a restaurantReviews collection that contains review scores
for a single restaurant over time:
db.restaurantReviews.insertMany( [    {       date: ISODate("2021-03-08"),       score: 90    },    {       date: ISODate("2021-03-09"),       score: 92    },    {       date: ISODate("2021-03-10")    },    {       date: ISODate("2021-03-11")    },    {       date: ISODate("2021-03-12"),       score: 85    },    {       date: ISODate("2021-03-13")    } ] ) 
The score field is missing for some of the documents in the
collection.
To populate the missing score fields and ensure that there are no
gaps in the data, use $fill. In the following example,
$fill uses the locf fill method to fill the missing
score values with the previous score in the sequence:
db.restaurantReviews.aggregate( [    {       $fill:          {             sortBy: { date: 1 },             output:                {                   "score": { method: "locf" }                }          }    } ] ) 
In the preceding pipeline:
- $fillfills in missing- scorevalues.
- sortBy: { date: 1 }sorts the documents by the- datefield in ascending order, from earliest to latest.
- output specifies: - scoreas the field for which to fill in missing values.
- { method: "locf" }as the fill method. The- locffill method fills missing- scorevalues with the last observed- scorein the sequence.
 
Example output:
[    {      _id: ObjectId("62040bc9394d47411658b553"),      date: ISODate("2021-03-08T00:00:00.000Z"),      score: 90    },    {      _id: ObjectId("62040bc9394d47411658b554"),      date: ISODate("2021-03-09T00:00:00.000Z"),      score: 92    },    {      _id: ObjectId("62040bc9394d47411658b555"),      date: ISODate("2021-03-10T00:00:00.000Z"),      score: 92    },    {      _id: ObjectId("62040bc9394d47411658b556"),      date: ISODate("2021-03-11T00:00:00.000Z"),      score: 92    },    {      _id: ObjectId("62040bc9394d47411658b557"),      date: ISODate("2021-03-12T00:00:00.000Z"),      score: 85    },    {      _id: ObjectId("62040bc9394d47411658b558"),      date: ISODate("2021-03-13T00:00:00.000Z"),      score: 85    } ] 
Fill Data for Distinct Partitions
Consider the previous example with restaurant reviews but instead of tracking a single restaurant, the collection now contains reviews for multiple restaurants.
Create a collection named restaurantReviewsMultiple and populate the
collection with these documents:
db.restaurantReviewsMultiple.insertMany( [    {       date: ISODate("2021-03-08"),       restaurant: "Joe's Pizza",       score: 90    },    {       date: ISODate("2021-03-08"),       restaurant: "Sally's Deli",       score: 75    },    {       date: ISODate("2021-03-09"),       restaurant: "Joe's Pizza",       score: 92    },    {       date: ISODate("2021-03-09"),       restaurant: "Sally's Deli"    },    {       date: ISODate("2021-03-10"),       restaurant: "Joe's Pizza"    },    {       date: ISODate("2021-03-10"),       restaurant: "Sally's Deli",       score: 68    },    {       date: ISODate("2021-03-11"),       restaurant: "Joe's Pizza",       score: 93    },    {       date: ISODate("2021-03-11"),       restaurant: "Sally's Deli"    } ] ) 
The score field is missing for some of the documents in the
collection.
To populate the missing score fields and ensure that there are no
gaps in the data, use $fill. In the following example,
$fill uses the locf fill method to fill the missing
score values with the previous score in the sequence:
db.restaurantReviewsMultiple.aggregate( [    {       $fill:          {             sortBy: { date: 1 },             partitionBy: { "restaurant": "$restaurant" },             output:                {                   "score": { method: "locf" }                }          }    } ] ) 
In the preceding pipeline:
- $fillfills in missing- scorevalues.
- sortBy: { date: 1 }sorts the documents by the- datefield in ascending order, from earliest to latest.
- partitionBy: { "restaurant": "$restaurant" }partitions the data by- restaurant. There are two restaurants:- Joe's Pizzaand- Sally's Deli.
- output specifies: - scoreas the field for which to fill in missing values.
- { method: "locf" }as the fill method. The- locffill method fills missing- scorevalues with the last observed- scorein the sequence.
 
Example output:
[   {     _id: ObjectId("620559f4394d47411658b58f"),     date: ISODate("2021-03-08T00:00:00.000Z"),     restaurant: "Joe's Pizza",     score: 90   },   {     _id: ObjectId("620559f4394d47411658b591"),     date: ISODate("2021-03-09T00:00:00.000Z"),     restaurant: "Joe's Pizza",     score: 92   },   {     _id: ObjectId("620559f4394d47411658b593"),     date: ISODate("2021-03-10T00:00:00.000Z"),     restaurant: "Joe's Pizza",     score: 92   },   {     _id: ObjectId("620559f4394d47411658b595"),     date: ISODate("2021-03-11T00:00:00.000Z"),     restaurant: "Joe's Pizza",     score: 93   },   {     _id: ObjectId("620559f4394d47411658b590"),     date: ISODate("2021-03-08T00:00:00.000Z"),     restaurant: "Sally's Deli",     score: 75   },   {     _id: ObjectId("620559f4394d47411658b592"),     date: ISODate("2021-03-09T00:00:00.000Z"),     restaurant: "Sally's Deli",     score: 75   },   {     _id: ObjectId("620559f4394d47411658b594"),     date: ISODate("2021-03-10T00:00:00.000Z"),     restaurant: "Sally's Deli",     score: 68   },   {     _id: ObjectId("620559f4394d47411658b596"),     date: ISODate("2021-03-11T00:00:00.000Z"),     restaurant: "Sally's Deli",     score: 68   } ] 
Indicate if a Field was Populated Using $fill
When you populate missing values, the output does not indicate if a
value was populated with the $fill operator or if the value existed
in the document originally. To distinguish between filled and
preexisting values, you can use a $set stage before
$fill and set a new field based on whether the value exists.
For example, create a restaurantReviews collection that contains
review scores for a restaurant over time:
db.restaurantReviews.insertMany( [    {       date: ISODate("2021-03-08"),       score: 90    },    {       date: ISODate("2021-03-09"),       score: 92    },    {       date: ISODate("2021-03-10")    },    {       date: ISODate("2021-03-11")    },    {       date: ISODate("2021-03-12"),       score: 85    },    {       date: ISODate("2021-03-13")    } ] ) 
The score field is missing for some of the documents in the
collection. You can populate missing score values with the $fill
operator.
Create a pipeline to perform the following actions:
- Add a new field to each document (using - $set) indicating if the document's- scorefield exists prior to the- $filloperator populating values. This new field is called- valueExisted.
- Populate missing - scorevalues with the last observed- scorein the sequence. The fill method- locfstands for "last observation carried forward".
The pipeline looks like this:
db.restaurantReviews.aggregate( [    {       $set: {          "valueExisted": {             "$ifNull": [                { "$toBool": { "$toString": "$score" } },                false             ]          }       }    },    {       $fill: {          sortBy: { date: 1 },          output:             {                "score": { method: "locf" }             }       }    } ] ) 
Note
Handling Values of Zero
Output:
[    {       _id: ObjectId("63595116b1fac2ee2e957f15"),       date: ISODate("2021-03-08T00:00:00.000Z"),       score: 90,       valueExisted: true    },    {       _id: ObjectId("63595116b1fac2ee2e957f16"),       date: ISODate("2021-03-09T00:00:00.000Z"),       score: 92,       valueExisted: true    },    {       _id: ObjectId("63595116b1fac2ee2e957f17"),       date: ISODate("2021-03-10T00:00:00.000Z"),       valueExisted: false,       score: 92    },    {       _id: ObjectId("63595116b1fac2ee2e957f18"),       date: ISODate("2021-03-11T00:00:00.000Z"),       valueExisted: false,       score: 92    },    {       _id: ObjectId("63595116b1fac2ee2e957f19"),       date: ISODate("2021-03-12T00:00:00.000Z"),       score: 85,       valueExisted: true    },    {       _id: ObjectId("63595116b1fac2ee2e957f1a"),       date: ISODate("2021-03-13T00:00:00.000Z"),       valueExisted: false,       score: 85    } ]