5 examples of 'df.drop_duplicates' in Python


def remove_duplicates(df_or_series):
    """Remove rows or values with duplicate index labels, keeping the first of each.

    Parameters
    ----------
    df_or_series : :any:`pandas.DataFrame` or :any:`pandas.Series`
        Pandas object from which to drop duplicate index values.

    Returns
    -------
    deduplicated : :any:`pandas.DataFrame` or :any:`pandas.Series`
        The deduplicated pandas object.
    """
    # CalTrack 2.3.2.2
    return df_or_series[~df_or_series.index.duplicated(keep="first")]
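The remove_duplicates snippet deduplicates on the index rather than on row values. As a rough sketch of the difference, the example below compares that approach with `DataFrame.drop_duplicates`, which compares column values and ignores the index. The sample data and variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: the index label "a" appears twice,
# and the first "a" row and the "b" row hold identical values.
df = pd.DataFrame(
    {"x": [1, 2, 1], "y": [10, 20, 10]},
    index=["a", "a", "b"],
)

# Index-based deduplication, as in remove_duplicates:
# keeps the first row for each index label.
by_index = df[~df.index.duplicated(keep="first")]

# Value-based deduplication: keeps the first row for each distinct
# combination of column values; the index is not considered.
by_values = df.drop_duplicates(keep="first")

# Deduplicate on a subset of columns, keeping the last occurrence.
by_subset = df.drop_duplicates(subset=["x"], keep="last")

print(by_index)
print(by_values)
print(by_subset)
```

Which variant is appropriate depends on whether the index or the column values identify a record in your data.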

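import numpy as np


def _unique(df, columns=None):
    # NOTE: _flatten_list is a helper defined elsewhere in the source project
    # and is not shown in this snippet.
    # Accept a single column name as well as a list of names.
    if isinstance(columns, str):
        columns = [columns]
    # Default to every column when none are given.
    if not columns:
        columns = df.columns.tolist()
    info = {}
    for col in columns:
        values = df[col].dropna().values
        uniques = np.unique(list(_flatten_list(values))).tolist()
        info[col] = {'count': len(uniques), 'values': uniques}
    return info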

def drop_duplicate_events(df):
    """
    Function to group dataframe, use all new information from the latest row
    but keep the ``event_index`` from the first one
    """
    df = df.sort_values('event_index', na_position='last')
    event_index = df.event_index.iloc[0]
    r = df.iloc[-1].to_dict()
    r['event_index'] = event_index
    return r
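Because `drop_duplicate_events` returns a single dict, it is presumably meant to be called once per group of duplicate rows. A minimal sketch of that usage follows; the grouping column `event_id`, the `status` column, and the sample values are assumptions, not part of the original snippet.

```python
import pandas as pd


def drop_duplicate_events(df):
    # Same function as in the snippet, repeated so this example runs alone.
    df = df.sort_values("event_index", na_position="last")
    event_index = df.event_index.iloc[0]
    r = df.iloc[-1].to_dict()
    r["event_index"] = event_index
    return r


# Hypothetical events: "e1" was recorded twice with different details.
events = pd.DataFrame(
    {
        "event_id": ["e1", "e1", "e2"],
        "event_index": [3, 7, 5],
        "status": ["open", "closed", "open"],
    }
)

# Collapse each group of duplicates into one row: values come from the
# row with the highest event_index, event_index from the lowest.
merged = pd.DataFrame(
    [drop_duplicate_events(group) for _, group in events.groupby("event_id")]
)
print(merged)
```

Here "latest" means last after sorting by `event_index`, so a row with a missing `event_index` is treated as the latest because of `na_position='last'`.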

def _drop_col(self, df):
    '''
    Drops last column, which was added in the parsing procedure due to a
    trailing white space for each sample in the text file
    Arguments:
        df: pandas dataframe
    Return:
        df: original df with last column dropped
    '''
    return df.drop(df.columns[-1], axis=1)

import pandas as pd


def drop_some(df_: pd.DataFrame, thresh: int) -> pd.DataFrame:
    # thresh is the minimum number of non-NA values a column needs to be kept;
    # axis=1 means columns are dropped, not rows
    return df_.dropna(axis=1, thresh=thresh)
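In pandas, the `thresh` argument of `dropna` is the minimum number of non-NA values a row or column must have to be kept. A small sketch with made-up data shows the effect when dropping columns:

```python
import numpy as np
import pandas as pd

# Made-up data: column "a" has 3 non-NA values, "b" has 1, "c" has 0.
df = pd.DataFrame(
    {
        "a": [1.0, 2.0, 3.0],
        "b": [1.0, np.nan, np.nan],
        "c": [np.nan, np.nan, np.nan],
    }
)

# Keep only columns with at least 2 non-NA values.
kept_2 = df.dropna(axis=1, thresh=2)

# Keep columns with at least 1 non-NA value (drops all-NA columns only).
kept_1 = df.dropna(axis=1, thresh=1)

print(list(kept_2.columns))
print(list(kept_1.columns))
```

Passing `axis` by keyword also keeps the call valid on pandas 2.x, where `dropna` no longer accepts it positionally.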
