pandas.DataFrame.nlargest
DataFrame.nlargest(n, columns, keep='first')

Return the first n rows ordered by columns in descending order.

Return the first n rows with the largest values in columns, in descending order. The columns that are not specified are returned as well, but not used for ordering.

This method is equivalent to df.sort_values(columns, ascending=False).head(n), but more performant.

Parameters:
n : int
    Number of rows to return.
columns : label or list of labels
    Column label(s) to order by.
keep : {'first', 'last', 'all'}, default 'first'
    Where there are duplicate values:

    - first : prioritize the first occurrence(s)
    - last : prioritize the last occurrence(s)
    - all : do not drop any duplicates, even if it means selecting more than n items.

    New in version 0.24.0.

Returns:
DataFrame
    The first n rows ordered by the given columns in descending order.

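As a rough check of the equivalence with sort_values(columns, ascending=False).head(n) stated in the description, the sketch below compares both calls on a small frame. The data is a tie-free subset of the countries from the Examples section, chosen here as an assumption for illustration, since the two approaches are not guaranteed to pick the same rows when values are tied.

```python
import pandas as pd

# Illustrative frame with no duplicate values in 'population'.
df = pd.DataFrame(
    {
        "population": [59000000, 65000000, 434000, 337000, 11300],
        "GDP": [1937894, 2583560, 12011, 17036, 182],
    },
    index=["Italy", "France", "Malta", "Iceland", "Nauru"],
)

# Select the three most populous rows directly.
top = df.nlargest(3, "population")

# The same selection expressed as a full sort followed by head().
via_sort = df.sort_values("population", ascending=False).head(3)

# With no ties present, both frames hold the same rows in the same order.
same = top.equals(via_sort)
print(same)
```

The full sort orders every row before discarding most of them, which is why nlargest is the preferred spelling when only a few rows are needed.
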
See also
DataFrame.nsmallest - Return the first n rows ordered by columns in ascending order.
DataFrame.sort_values - Sort DataFrame by the values.
DataFrame.head - Return the first n rows without re-ordering.

Notes
This function cannot be used with all column types. For example, when specifying columns with object or category dtypes, TypeError is raised.

Examples
>>> df = pd.DataFrame({'population': [59000000, 65000000, 434000,
...                                   434000, 434000, 337000, 11300,
...                                   11300, 11300],
...                    'GDP': [1937894, 2583560, 12011, 4520, 12128,
...                            17036, 182, 38, 311],
...                    'alpha-2': ["IT", "FR", "MT", "MV", "BN",
...                                "IS", "NR", "TV", "AI"]},
...                   index=["Italy", "France", "Malta",
...                          "Maldives", "Brunei", "Iceland",
...                          "Nauru", "Tuvalu", "Anguilla"])
>>> df
          population      GDP alpha-2
Italy       59000000  1937894      IT
France      65000000  2583560      FR
Malta         434000    12011      MT
Maldives      434000     4520      MV
Brunei        434000    12128      BN
Iceland       337000    17036      IS
Nauru          11300      182      NR
Tuvalu         11300       38      TV
Anguilla       11300      311      AI

In the following example, we will use nlargest to select the three rows having the largest values in column “population”.

>>> df.nlargest(3, 'population')
        population      GDP alpha-2
France    65000000  2583560      FR
Italy     59000000  1937894      IT
Malta       434000    12011      MT

When using keep='last', ties are resolved in reverse order:

>>> df.nlargest(3, 'population', keep='last')
        population      GDP alpha-2
France    65000000  2583560      FR
Italy     59000000  1937894      IT
Brunei      434000    12128      BN

When using keep='all', all duplicate items are maintained:

>>> df.nlargest(3, 'population', keep='all')
          population      GDP alpha-2
France      65000000  2583560      FR
Italy       59000000  1937894      IT
Malta         434000    12011      MT
Maldives      434000     4520      MV
Brunei        434000    12128      BN

To order by the largest values in column “population” and then “GDP”, we can specify multiple columns like in the next example.

>>> df.nlargest(3, ['population', 'GDP'])
        population      GDP alpha-2
France    65000000  2583560      FR
Italy     59000000  1937894      IT
Brunei      434000    12128      BN

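The dtype restriction mentioned in the Notes can be seen with a column of strings. The following is a minimal sketch and not part of the original examples: the frame small and its column names are illustrative, the 'code' column is forced to object dtype explicitly, and only the exception type is checked because the exact error message may differ between pandas versions.

```python
import pandas as pd

# Illustrative frame: one numeric column and one object-dtype column.
small = pd.DataFrame(
    {
        "population": [59000000, 65000000, 434000],
        "code": pd.Series(["IT", "FR", "MT"], dtype=object),
    }
)

# Ordering by the object-dtype column is rejected with a TypeError.
try:
    small.nlargest(2, "code")
    raised = False
except TypeError:
    raised = True

print(raised)

# Ordering by the numeric column works; 'code' is carried along unchanged.
top = small.nlargest(2, "population")
print(top)
```

When ordering by such a column is really needed, sort_values followed by head is the usual fallback, since it accepts a wider range of dtypes.
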
© 2008–2012, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team
Licensed under the 3-clause BSD License.
https://pandas.pydata.org/pandas-docs/version/0.24.2/reference/api/pandas.DataFrame.nlargest.html