re.sub fails with "expected string or bytes-like object"

Question

I have read several posts about this error, but I still can't figure it out. It happens when I call my function in a loop:

def fix_Plan(location):
    letters_only = re.sub("[^a-zA-Z]",  # Search for all non-letters
                          " ",          # Replace all non-letters with spaces
                          location)     # Column and row to search    

    words = letters_only.lower().split()     
    stops = set(stopwords.words("english"))      
    meaningful_words = [w for w in words if not w in stops]      
    return (" ".join(meaningful_words))    

col_Plan = fix_Plan(train["Plan"][0])    
num_responses = train["Plan"].size    
clean_Plan_responses = []

for i in range(0,num_responses):
    clean_Plan_responses.append(fix_Plan(train["Plan"][i]))

Here is the error:

Traceback (most recent call last):
  File "C:/Users/xxxxx/PycharmProjects/tronc/tronc2.py", line 48, in <module>
    clean_Plan_responses.append(fix_Plan(train["Plan"][i]))
  File "C:/Users/xxxxx/PycharmProjects/tronc/tronc2.py", line 22, in fix_Plan
    location)  # Column and row to search
  File "C:\Users\xxxxx\AppData\Local\Programs\Python\Python36\lib\re.py", line 191, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object

smci

Edited question 8 December 2018 at 8:38

Programming

regex

python

pandas

nltk

Solution / Answer

Bilal Chandio

12 October 2019 at 12:46

I think it would be better to use the re.match() function. Here is an example that may help you.

import re
import nltk
from nltk.tokenize import word_tokenize

nltk.download('punkt')  # tokenizer data used by word_tokenize
sentences = word_tokenize("I love to learn NLP \n 'a :(")
# Keep only the tokens that start with a letter, lowercased
sentences = [word.lower() for word in sentences if re.match('^[a-zA-Z]+', word)]
sentences
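A caveat about the filter above, shown as a minimal sketch: re.match only anchors at the start of the string, so a token such as "a1" still passes, whereas re.fullmatch requires the whole token to be letters. The token list here is a hypothetical stand-in for word_tokenize output so that the snippet runs without NLTK. Also, re.match raises the same TypeError as re.sub when given a float, so switching functions does not by itself address the error in the question.

```python
import re

# Hypothetical token list standing in for word_tokenize output.
tokens = ["I", "love", "NLP", "'a", ":(", "a1", "42"]

# re.match only checks the beginning of each token, so "a1" is kept.
prefix_ok = [t.lower() for t in tokens if re.match("[a-zA-Z]+", t)]

# re.fullmatch requires every character of the token to be a letter.
letters_only = [t.lower() for t in tokens if re.fullmatch("[a-zA-Z]+", t)]

# Like re.sub, re.match rejects non-string input such as a float.
try:
    re.match("[a-zA-Z]+", 3.5)
    match_raised = False
except TypeError:
    match_raised = True
```

Which of the two filters is appropriate depends on whether mixed tokens like "a1" should be kept.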

abccd · Accepted Answer · 2017-05-01T23:08:27+00:00

As you stated in the comments, some of the values appear to be floats, not strings. You will need to convert them to strings before passing them to re.sub. The simplest way is to change location to str(location) when calling re.sub. It doesn't hurt to do this even if the value is already a str.

letters_only = re.sub("[^a-zA-Z]",    # Search for all non-letters
                      " ",            # Replace all non-letters with spaces
                      str(location))  # Convert to str first, so floats are accepted
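One thing to watch with the str() approach: if the floats are missing values (NaN), which is a common reason a text column holds floats, str(float("nan")) is the string "nan", and that would survive the cleaning as a word. The sketch below is one possible variant, not the asker's actual code: fix_plan_safe and STOPS are hypothetical names, and the small stop word set stands in for stopwords.words("english") so the example runs without NLTK.

```python
import math
import re

# Hypothetical stand-in for stopwords.words("english").
STOPS = {"the", "a", "an", "is", "to", "of", "and"}

def fix_plan_safe(location):
    """Clean one cell; None and NaN are treated as empty text (an assumption)."""
    if location is None or (isinstance(location, float) and math.isnan(location)):
        return ""
    # str() makes non-string values such as 3.5 acceptable to re.sub.
    letters_only = re.sub("[^a-zA-Z]", " ", str(location))
    words = letters_only.lower().split()
    return " ".join(w for w in words if w not in STOPS)

# Reproduce the original failure: re.sub accepts only str or bytes-like input.
try:
    re.sub("[^a-zA-Z]", " ", float("nan"))
    raised = False
except TypeError:
    raised = True

# Plain str() turns NaN into the literal word "nan".
nan_as_text = re.sub("[^a-zA-Z]", " ", str(float("nan")))

cleaned = [fix_plan_safe(v) for v in ["The plan is to save 20%", float("nan"), 3.5, None]]
```

With a pandas column, the same idea could be applied per cell in the loop from the question, replacing the call to fix_Plan with the guarded version.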