Split strings into parts given in a vector

Time: 2018-01-15 02:03:14

Tags: r dplyr

R gurus, I am struggling to find an efficient way to split strings into parts that are given in a vector.

In the example below I have a few cryptocurrency pairs from the BINANCE exchange. I would like to split each pair into two separate parts; the valid parts are given in the symbol column of the top100 data frame.

library(dplyr)
library(jsonlite)
library(RCurl)

top100 <- data.frame(fromJSON(getURL(paste0('https://api.coinmarketcap.com/v1/ticker/?start=0&limit=100'))))

markets <- data.frame(pairs = c("NEOBTC","EOSETH","VENETH","ELFETH","ICXETH","BNBETH","NEOETH",
                                "TRXETH","QTUMETH","DASHETH","XRPETH" ,"ETHUSDT","LTCUSDT","ADAETH",
                                "XMRETH","ZECETH","IOTAETH","NEOUSDT","BNBUSDT","XLMBNB","LSKBNB"),
                      symbol1 = NA,
                      symbol2 = NA)

markets$symbol1 <- substr(markets$pairs, 1,3)
markets$symbol2 <- substr(markets$pairs, 4,6)

markets$symbol1 %in% top100$symbol
markets$symbol2 %in% top100$symbol

A naive approach, as in the code above, is to take the first three characters of each pair as symbol1 and the next three characters as symbol2. This fails because some symbols have more than three characters, e.g. DASH.

1 answer:

Answer 0 (score: 1)

You can try the following code:

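a <- markets$pairs
# For each known symbol, insert a space before it where it ends a pair; keep only the pairs that were split
grep("\\w\\s\\w",
     sapply(paste0("(", top100$symbol, "$)"), sub, "\\3 \\1", a),
     value = TRUE) %>%
  {.[match(a, sub("\\s", "", .))]} %>%   # restore the original order of the pairs
  strsplit(., "\\s") %>%
  do.call(rbind, .) %>%
  {setNames(as.data.frame(.), paste0("Symbols", 1:2))}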

You can also try:

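a <- markets$pairs
# Strip the trailing known symbol to get the first part, then strip that part from the front to get the second
sub(paste0("(", top100$symbol, ")$", collapse = "|"), "", a) %>%
  {cbind.data.frame(Symbols1 = ., Symbols2 = sub(paste0("^(", ., ")", collapse = "|"), "", a))}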

Both of the snippets above give:

      Symbols1 Symbols2
1       NEO      BTC
2       EOS      ETH
3       VEN      ETH
4       ELF      ETH
5       ICX      ETH
6       BNB      ETH
7       NEO      ETH
8       TRX      ETH
9      QTUM      ETH
10     DASH      ETH
11      XRP      ETH
12      ETH     USDT
13      LTC     USDT
14      ADA      ETH
15      XMR      ETH
16      ZEC      ETH
17     IOTA      ETH
18      NEO     USDT
19      BNB     USDT
20      XLM      BNB
21      LSK      BNB
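
For comparison, the same suffix-matching idea can be sketched in base R without regular expressions. This is only an illustration, not the answerer's method: known_symbols is a small hard-coded stand-in for top100$symbol so that the example runs offline, and split_pair is a hypothetical helper name. It assumes every pair ends in one of the known symbols and, when several match, prefers the longest one.

```r
# Hypothetical stand-in for top100$symbol so the sketch runs without network access.
known_symbols <- c("BTC", "ETH", "USDT", "BNB", "NEO", "EOS", "QTUM", "DASH", "IOTA")

pairs <- c("NEOBTC", "QTUMETH", "DASHETH", "ETHUSDT", "IOTAETH", "XLMBNB")

# split_pair() is a hypothetical helper: it finds the longest known symbol
# that the pair ends with and treats the remaining prefix as the first symbol.
split_pair <- function(pair, symbols) {
  hits <- symbols[endsWith(pair, symbols)]
  if (length(hits) == 0) {
    # No known symbol ends this pair, so it cannot be split.
    return(c(symbol1 = NA_character_, symbol2 = NA_character_))
  }
  second <- hits[which.max(nchar(hits))]
  first <- substr(pair, 1, nchar(pair) - nchar(second))
  c(symbol1 = first, symbol2 = second)
}

# One row per pair, with columns symbol1 and symbol2.
result <- as.data.frame(t(sapply(pairs, split_pair, symbols = known_symbols)),
                        stringsAsFactors = FALSE)
result$pairs <- pairs
print(result)
```

Only the second part is checked against the symbol list here; if the first part also has to be a known symbol, result$symbol1 %in% known_symbols can flag the rows that are not.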