Python: modifying only the first line of a multi-line string

Using the code below, I fetched the text I need from the KEGG API. The raw response looks like this:

[String 1]:

b'>hsa:10056 K01890 phenylalanyl-tRNA synthetase beta chain [EC:6.1.1.20] | (RefSeq) FARSB, FARSLB, FRSB, HSPC173, NEDBLLA, PheHB, PheRS; phenylalanyl-tRNA synthetase subunit beta (A)\nMPTVSVKRDLLFQALGRTYTDEEFDELCFEFGLELDEITSEKEIISKEQGNVKAAGASDV\nVLYKIDVPANRYDLLCLEGLVRGLQVFKERIKAPVYKRVMPDGKIQKLIITEETAKIRPF\nAVAAVLRNIKFTKDRYDSFIELQEKLHQNICRKRALVAIGTHDLDTLSGPFTYTAKRPSD\nIKFKPLNKTKEYTACELMNIYKTDNHLKHYLHIIENKPLYPVIYDSNGVVLSMPPIINGD\nHSRITVNTRNIFIECTGTDFTKAKIVLDIIVTMFSEYCENQFTVEAAEVVFPNGKSHTFP\nELAYRKEMVRADLINKKVGIRETPENLAKLLTRMYLKSEVIGDGNQIEIEIPPTRADIIH\nACDIVEDAAIAYGYNNIQMTLPKTYTIANQFPLNKLTELLRHDMAAAGFTEALTFALCSQ\nEDIADKLGVDISATKAVHISNPKTAEFQVARTTLLPGLLKTIAANRKMPLPLKLFEISDI\nVIKDSNTDVGAKNYRHLCAVYYNKNPGFEIIHGLLDRIMQLLDVPPGEDKGGYVIKASEG\nPAFFPGRCAEIFARGQSVGKLGVLHPDVITKFELTMPCSSLEINVGPFL\n'

After decoding it as UTF-8:
[String 2]:

>hsa:10056 K01890 phenylalanyl-tRNA synthetase beta chain [EC:6.1.1.20] | (RefSeq) FARSB, FARSLB, FRSB, HSPC173, NEDBLLA, PheHB, PheRS; phenylalanyl-tRNA synthetase subunit beta (A)
MPTVSVKRDLLFQALGRTYTDEEFDELCFEFGLELDEITSEKEIISKEQGNVKAAGASDV
VLYKIDVPANRYDLLCLEGLVRGLQVFKERIKAPVYKRVMPDGKIQKLIITEETAKIRPF
AVAAVLRNIKFTKDRYDSFIELQEKLHQNICRKRALVAIGTHDLDTLSGPFTYTAKRPSD
IKFKPLNKTKEYTACELMNIYKTDNHLKHYLHIIENKPLYPVIYDSNGVVLSMPPIINGD
HSRITVNTRNIFIECTGTDFTKAKIVLDIIVTMFSEYCENQFTVEAAEVVFPNGKSHTFP
ELAYRKEMVRADLINKKVGIRETPENLAKLLTRMYLKSEVIGDGNQIEIEIPPTRADIIH
ACDIVEDAAIAYGYNNIQMTLPKTYTIANQFPLNKLTELLRHDMAAAGFTEALTFALCSQ
EDIADKLGVDISATKAVHISNPKTAEFQVARTTLLPGLLKTIAANRKMPLPLKLFEISDI
VIKDSNTDVGAKNYRHLCAVYYNKNPGFEIIHGLLDRIMQLLDVPPGEDKGGYVIKASEG
PAFFPGRCAEIFARGQSVGKLGVLHPDVITKFELTMPCSSLEINVGPFL

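My goal is to delete everything after the first space on the first line and leave the remaining lines unchanged, so that the result looks like this:
[String 3]: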

>hsa:10056
MPTVSVKRDLLFQALGRTYTDEEFDELCFEFGLELDEITSEKEIISKEQGNVKAAGASDV
VLYKIDVPANRYDLLCLEGLVRGLQVFKERIKAPVYKRVMPDGKIQKLIITEETAKIRPF
AVAAVLRNIKFTKDRYDSFIELQEKLHQNICRKRALVAIGTHDLDTLSGPFTYTAKRPSD
IKFKPLNKTKEYTACELMNIYKTDNHLKHYLHIIENKPLYPVIYDSNGVVLSMPPIINGD
HSRITVNTRNIFIECTGTDFTKAKIVLDIIVTMFSEYCENQFTVEAAEVVFPNGKSHTFP
ELAYRKEMVRADLINKKVGIRETPENLAKLLTRMYLKSEVIGDGNQIEIEIPPTRADIIH
ACDIVEDAAIAYGYNNIQMTLPKTYTIANQFPLNKLTELLRHDMAAAGFTEALTFALCSQ
EDIADKLGVDISATKAVHISNPKTAEFQVARTTLLPGLLKTIAANRKMPLPLKLFEISDI
VIKDSNTDVGAKNYRHLCAVYYNKNPGFEIIHGLLDRIMQLLDVPPGEDKGGYVIKASEG
PAFFPGRCAEIFARGQSVGKLGVLHPDVITKFELTMPCSSLEINVGPFL

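My question is:

After fetching the text, how can I process the multi-line string directly, without saving it to a file first, and only then save the result to a file?
Writing the fetched text to a file and then going back to process that file seems redundant.

This is the code that fetches the text: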

import urllib.request

def getHtml(url):  # fetch the page source and decode it as UTF-8
    request = urllib.request.Request(url)
    response = urllib.request.urlopen(request)
    return response.read().decode('utf-8')

url1 = "http://rest.kegg.jp/get/hsa:10056/aaseq"
text = getHtml(url1)

The content of 'text' is what [String 2] shows above.
I know that split can cut a single line down to its first field:

>>>str1 = 'hsa:10056 K01890 phenylalanyl-tRNA synthetase beta chain [EC:6.1.1.20] | (RefSeq) FARSB, FARSLB, FRSB, HSPC173, NEDBLLA, PheHB, PheRS; phenylalanyl-tRNA synthetase subunit beta (A)'
>>>str2 = str1.split(' ')[:1]
>>>print(str2)
['hsa:10056']

The problem is that 'text' is a multi-line string and I only want to process its first line. I don't know how to do that.

Answers
醉淸風

Not sure whether this is what you want:
[image; its content was not preserved]

October 4, 2017, 00:31
不歸路

If you know the newline character is \n, you should be able to work out the rest: str1.split("\n")[0].split(" ")[:1]
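Note that the one-liner above evaluates to a one-element list holding only the first field, so the sequence lines are dropped. To get [String 3], the trimmed first line has to be joined back onto the rest of the text. A minimal sketch of one way to do that follows. The function name trim_header and the shortened sample string are illustrative assumptions, not part of the original post.

```python
def trim_header(text: str) -> str:
    """Cut line 1 at its first space; leave every other line untouched."""
    # partition() splits at the first newline only, so the sequence
    # lines in `rest` are never modified.
    first_line, newline, rest = text.partition("\n")
    # split(" ", 1)[0] keeps everything before the first space.
    return first_line.split(" ", 1)[0] + newline + rest


# Shortened stand-in for the decoded KEGG response ([String 2]).
sample = ">hsa:10056 K01890 phenylalanyl-tRNA synthetase beta chain\nMPTVSVKRDLL\nVLYKIDVPANR\n"
print(trim_header(sample))
```

Because only the first line is split, spaces appearing later in the text would not be affected.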

December 26, 2017, 02:06
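On the question of avoiding an intermediate file: the decoded response is already an ordinary str, so it can be edited in memory and written to disk once. The following is a rough sketch under that assumption. The helper names shorten_first_line and save_processed, the output file name, and the shortened sample text are hypothetical. In real use, the string returned by getHtml(url1) would take the place of the sample.

```python
import tempfile
from pathlib import Path


def shorten_first_line(text: str) -> str:
    """Keep only the part of line 1 before its first space."""
    first_line, newline, rest = text.partition("\n")
    return first_line.split(" ", 1)[0] + newline + rest


def save_processed(text: str, path) -> str:
    """Process the string in memory, then write the result to disk once."""
    processed = shorten_first_line(text)
    Path(path).write_text(processed, encoding="utf-8")
    return processed


# Shortened stand-in for the string returned by getHtml(url1).
text = ">hsa:10056 K01890 phenylalanyl-tRNA synthetase beta chain\nMPTVSVKRDLL\n"

# A temporary directory keeps the demo from leaving files behind;
# the file name is an arbitrary choice.
with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "hsa_10056.fasta"
    save_processed(text, out_path)
    print(out_path.read_text(encoding="utf-8"))
```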