
How to conditionally separate a cell value and add to a column using pandas

posted on 2024-11-07 20:02


For example, given this testing.csv:

First Name    Last Name  Profile URL
Ashleigh      Phelps     https://www.linkedin.com/in/ashleighephelps
Jonathan                 https://www.linkedin.com/in/jonathantsegal
Camilla Innes            https://www.linkedin.com/in/camilla-innes-61213628  
Rachel                   https://www.linkedin.com/in/rachel-hudesman-335b8120
Michael                  https://www.linkedin.com/in/mikeitalia
Antonio                  https://www.linkedin.com/in/antoniomolinelli
Lauren        Zsigray    https://www.linkedin.com/in/lauren-zsigray-13b5aa25

The code I have used only fills in the last name when the URL contains a hyphen. How can I also get the last name when it is joined to the first name in the URL (for example, jonathantsegal)?

import pandas as pd

df = pd.read_csv("testing.csv", sep=',', encoding="utf-8")
df = df[df['Last Name'].isnull()]
p = df.pop('Profile URL')
tmp_df = p.str.split('/')
df['Last Name'] = tmp_df.str[-1]
tmp1_df = df.pop('Last Name').str.split('-')
df['Last Name'] = tmp1_df.str[1:-1].str.join(sep='-')
df = pd.concat([df, p], axis=1)
print(df)

Which gives this output:

First Name  Last Name       Profile URL
Ashleigh    Phelps          https://www.linkedin.com/in/ashleighephelps
Jonathan                    https://www.linkedin.com/in/jonathantsegal
Camilla     Innes           https://www.linkedin.com/in/camilla-innes-61213628
Rachel      hudesman        https://www.linkedin.com/in/rachel-hudesman-335b8120
Michael                     https://www.linkedin.com/in/mikeitalia
Antonio                     https://www.linkedin.com/in/antoniomolinelli
Lauren      Zsigray         https://www.linkedin.com/in/lauren-zsigray-13b5aa25
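
The blank rows follow from the [1:-1] slice: a URL ending without hyphens splits into a single part, so nothing is left after dropping the first and last parts. A minimal sketch of that behaviour on plain strings (middle_parts is a hypothetical helper name, not part of the original code):

```python
def middle_parts(slug):
    """Mimic the question's slicing: drop the first and last hyphen-separated parts."""
    return "-".join(slug.split("-")[1:-1])


# Slugs taken from the Profile URL column of the sample data.
print(middle_parts("rachel-hudesman-335b8120"))  # hudesman
print(repr(middle_parts("jonathantsegal")))      # ''
```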

Expected output:

First Name  Last Name       Profile URL
Ashleigh    Phelps          https://www.linkedin.com/in/ashleighephelps
Jonathan    tsegal          https://www.linkedin.com/in/jonathantsegal
Camilla     Innes           https://www.linkedin.com/in/camilla-innes-13628
Rachel      hudesman        https://www.linkedin.com/in/rachel-hudesman-33
Michael                     https://www.linkedin.com/in/mikeitalia
Antonio     molinelli       https://www.linkedin.com/in/antoniomolinelli
Lauren      Zsigray         https://www.linkedin.com/in/lauren-zsigray-13b5a  

What should I use to get the output in this format?


solution


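Try this piece of code. When the last name is missing, it takes it from the URL: from the hyphen-separated parts if there are any, otherwise from whatever follows the first name: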

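import pandas as pd

df = pd.read_csv("testing.csv", sep=',', encoding="utf-8")

# Replace NaN with empty strings so the string methods below are safe
df.fillna('', inplace=True)

def clear_data(x):
    fname = x['First Name']
    lname = x['Last Name'].strip()
    url = x['Profile URL']
    if not lname:
        # Keep only the first word, in case the full name ended up in First Name
        fname = fname.split(' ')[0]
        # Last path segment of the URL, split on hyphens
        url_name = url.split('/')[-1].split('-')
        if len(url_name) > 1:
            # e.g. rachel-hudesman-335b8120: take the second-to-last part,
            # i.e. the one just before the trailing ID
            lname = url_name[-2].title()
        else:
            # e.g. jonathantsegal: take whatever follows the first name
            index_of_fname = url_name[0].lower().find(fname.lower())
            if index_of_fname != -1:
                index_of_fname += len(fname)
                lname = url_name[0][index_of_fname:].title()

        x['First Name'] = fname
        x['Last Name'] = lname
    else:
        lname = lname.split('-')[0].strip()
        x['Last Name'] = lname

    return x


# apply() returns a new DataFrame, so assign the result back
df = df.apply(clear_data, axis=1)

print(df)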
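
To try these rules without a testing.csv on disk, here is a minimal self-contained sketch. It assumes the real file is comma-separated with the rows shown in the question; SAMPLE_CSV and guess_last_name are hypothetical names introduced only for this illustration, and the helper simply restates the answer's rules as a function over plain strings.

```python
import io

import pandas as pd

# Assumption: the real testing.csv is comma-separated; rows mirror the question's sample.
SAMPLE_CSV = """First Name,Last Name,Profile URL
Ashleigh,Phelps,https://www.linkedin.com/in/ashleighephelps
Jonathan,,https://www.linkedin.com/in/jonathantsegal
Camilla Innes,,https://www.linkedin.com/in/camilla-innes-61213628
Rachel,,https://www.linkedin.com/in/rachel-hudesman-335b8120
Michael,,https://www.linkedin.com/in/mikeitalia
Antonio,,https://www.linkedin.com/in/antoniomolinelli
Lauren,Zsigray,https://www.linkedin.com/in/lauren-zsigray-13b5aa25
"""


def guess_last_name(first, last, url):
    """Hypothetical helper: return (first, last), inferring a missing last name from the URL."""
    last = last.strip()
    if last:
        # Existing last name: keep the part before any hyphen.
        return first, last.split("-")[0].strip()
    first = first.split(" ")[0]              # "Camilla Innes" -> "Camilla"
    parts = url.split("/")[-1].split("-")    # last path segment, split on hyphens
    if len(parts) > 1:
        return first, parts[-2].title()      # part just before the trailing ID
    slug = parts[0]
    start = slug.lower().find(first.lower())
    if start != -1:
        return first, slug[start + len(first):].title()
    return first, ""                         # first name not found in the URL


df = pd.read_csv(io.StringIO(SAMPLE_CSV)).fillna("")
pairs = [
    guess_last_name(f, l, u)
    for f, l, u in zip(df["First Name"], df["Last Name"], df["Profile URL"])
]
df["First Name"] = [first for first, _ in pairs]
df["Last Name"] = [last for _, last in pairs]
print(df[["First Name", "Last Name"]])
```

On this sample the last names come out as Phelps, Tsegal, Innes, Hudesman, an empty string for Michael (his URL does not contain "michael"), Molinelli and Zsigray. Note that .title() capitalizes the result, whereas the expected output in the question shows tsegal and molinelli in lowercase; drop the .title() calls if lowercase is wanted. A slug with a single hyphen and no trailing ID would yield the first part, so the parts[-2] rule is a heuristic tied to this data.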



Author:qs

link:http://www.pythonblackhole.com/blog/article/246856/cd3c71a6a69658d78746/

source:python black hole net

