5 examples of 'drop duplicates pandas' in Python

def remove_duplicates(df_or_series):
    """ Remove duplicate rows or values by keeping the first of each duplicate.

    Parameters
    ----------
    df_or_series : :any:`pandas.DataFrame` or :any:`pandas.Series`
        Pandas object from which to drop duplicate index values.

    Returns
    -------
    deduplicated : :any:`pandas.DataFrame` or :any:`pandas.Series`
        The deduplicated pandas object.
    """
    # CalTrack 2.3.2.2
    return df_or_series[~df_or_series.index.duplicated(keep="first")]
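
The function above deduplicates on index labels only. As a minimal sketch of how that differs from value-based deduplication with `DataFrame.drop_duplicates`, the example below applies both to a small frame. The sample data and the variable names (`by_index`, `by_values`, `by_city`) are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: the index label "a" appears twice,
# and the last two rows hold identical column values.
df = pd.DataFrame(
    {"city": ["Oslo", "Rome", "Rome"], "temp": [3, 18, 18]},
    index=["a", "a", "b"],
)

# Index-based: keep the first row for each index label.
by_index = df[~df.index.duplicated(keep="first")]

# Value-based: keep the first row for each distinct combination of column values.
by_values = df.drop_duplicates(keep="first")

# Value-based on a subset of columns, keeping the last occurrence instead.
by_city = df.drop_duplicates(subset=["city"], keep="last")

print(by_index)
print(by_values)
print(by_city)
```

Index-based deduplication ignores the column values entirely, while `drop_duplicates` ignores the index, so the two can keep different rows from the same frame.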

def drop_duplicate_events(df):
    """
    Function to group dataframe, use all new information from the latest row
    but keep the ``event_index`` from the first one
    """
    df = df.sort_values('event_index', na_position='last')
    event_index = df.event_index.iloc[0]
    r = df.iloc[-1].to_dict()
    r['event_index'] = event_index
    return r
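
Because `drop_duplicate_events` returns a plain dict for one group of rows, it has to be called once per group. The sketch below shows one possible way to do that with `groupby`. The grouping column `event_id`, the `status` column, and the sample rows are assumptions made for this example and do not come from the original project.

```python
import pandas as pd

def drop_duplicate_events(df):
    # Same logic as the snippet above: last row's data, first row's event_index.
    df = df.sort_values("event_index", na_position="last")
    event_index = df.event_index.iloc[0]
    r = df.iloc[-1].to_dict()
    r["event_index"] = event_index
    return r

# Hypothetical data: event "x" was recorded twice, event "y" once.
events = pd.DataFrame(
    {
        "event_id": ["x", "x", "y"],
        "event_index": [2, 1, 5],
        "status": ["new", "old", "only"],
    }
)

# Collapse each group of duplicates into a single record, then rebuild a frame.
merged = pd.DataFrame(
    [drop_duplicate_events(group) for _, group in events.groupby("event_id")]
)
print(merged)
```

Note that "latest" here means the last row after sorting by `event_index`, so the merged record for `x` carries the values of the row with the highest `event_index` and the `event_index` of the lowest.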

def _drop_col(self, df):
    '''
    Drops last column, which was added in the parsing procedure due to a
    trailing white space for each sample in the text file

    Arguments:
        df: pandas dataframe
    Return:
        df: original df with last column dropped
    '''
    return df.drop(df.columns[-1], axis=1)

import pandas as pd

def drop_some(df_: pd.DataFrame, thresh: int) -> pd.DataFrame:
    # thresh is the minimum number of non-NA values a column needs to be kept;
    # axis=1 means columns are dropped, not rows
    return df_.dropna(axis=1, thresh=thresh)
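
To make the meaning of `thresh` concrete, here is a small sketch with made-up data. The column names and values are illustrative only. Passing `axis` by keyword is the form that also works on pandas 2.x, where the arguments of `dropna` are keyword-only.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: the columns hold 3, 2, and 1 non-NA values respectively.
df = pd.DataFrame(
    {
        "full": [1.0, 2.0, 3.0],
        "one_gap": [1.0, np.nan, 3.0],
        "mostly_empty": [np.nan, np.nan, 3.0],
    }
)

# Keep only the columns that contain at least 2 non-NA values.
trimmed = df.dropna(axis=1, thresh=2)
print(list(trimmed.columns))
```

Here `mostly_empty` is dropped because it has only one non-NA value, which is below the threshold of 2.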

def _clean_columns(df, keep_colnames):
    new_colnames = []
    for i, colname in enumerate(df.columns):
        if colname not in keep_colnames:
            new_colnames.append(i)
        else:
            new_colnames.append(colname)
    return new_colnames
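
`_clean_columns` does not drop anything. It returns a new list of column labels in which every column not listed in `keep_colnames` is renamed to its integer position. A possible usage, sketched with a hypothetical frame and column names, is to assign the result back to `df.columns`:

```python
import pandas as pd

def _clean_columns(df, keep_colnames):
    # Same logic as the snippet above.
    new_colnames = []
    for i, colname in enumerate(df.columns):
        if colname not in keep_colnames:
            new_colnames.append(i)
        else:
            new_colnames.append(colname)
    return new_colnames

# Hypothetical frame with one unwanted column label in the middle.
df = pd.DataFrame([[1, 2, 3]], columns=["id", "Unnamed: 1", "value"])

# Replace the labels in place; columns not in keep_colnames become their position.
df.columns = _clean_columns(df, keep_colnames=["id", "value"])
print(list(df.columns))
```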
