Remove leading and trailing characters from a string

Time: 2017-05-12 06:41:26

Tags: javascript string algorithm

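I have a text file in which strings are separated by spaces. The file contains some special characters (Latin, currency, punctuation, etc.) that need to be discarded from the final output. Note that, apart from these special characters, any Unicode character is a legal character.

We need to split the text on spaces and then remove only the leading and trailing special characters of each string. If special characters sit between two legal characters, we do not remove them.

I can do this easily in two passes: split the text on spaces, then strip the leading and trailing special characters from each string. However, I need to process the string only once. Is there a way to do it in a single pass? Note: we cannot use RegEx. For this question, assume these characters are special: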

[: , ! . < ; '  "  >  [ ] { }  `  ~ = + - ? / ]

Example:

:!/,.<;:.?;,BBM!/,.<;:.?;,` IS TALKING TO `B!?AM!/,.<;:.?;,

The output here would be an array of valid strings: ["BBM", "IS", "TALKING", "TO", "B!?AM"]
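For reference, the two-pass approach described in the question can be sketched as follows. This is only a baseline to compare single-pass answers against, not the single-pass solution being asked for. The names `SPECIAL_CHARS`, `isSpecial`, `trimSpecial` and `splitThenTrim` are hypothetical, and the special set is assumed to be exactly the list given above.

```typescript
// Assumption: the special characters are exactly those listed in the question.
const SPECIAL_CHARS = ":,!.<;'\">[]{}`~=+-?/";

function isSpecial(ch: string): boolean {
    return SPECIAL_CHARS.indexOf(ch) !== -1;
}

// Trim leading and trailing special characters by moving two indices inward.
// Special characters between legal ones are left untouched.
function trimSpecial(token: string): string {
    let start = 0;
    let end = token.length;
    while (start < end && isSpecial(token[start])) start++;
    while (end > start && isSpecial(token[end - 1])) end--;
    return token.slice(start, end);
}

// Pass 1 splits on spaces (a plain string separator, no RegEx);
// pass 2 trims each token and drops tokens that end up empty.
function splitThenTrim(text: string): string[] {
    return text
        .split(" ")
        .map(trimSpecial)
        .filter((token) => token.length > 0);
}
```

Calling `splitThenTrim` on the example string from the question yields the five expected words.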

2 answers:

Answer 0: (score: 0)

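  • Build a simple state machine (finite automaton)
  • Loop over all the characters
  • At each step, check whether the current character is a letter, a space, or a special character
  • Perform some action (possibly none), depending on the state and the character type
  • Change the state when needed
For example, you can stay in the "special" state until you meet a letter. Then remember the index where the word starts and switch to the "inside word" state. Continue until you meet a special character or a space (your question is still unclear on this point).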
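The steps above could be sketched roughly like this. It is one possible reading of the state machine, under the assumptions that a word ends only at a space (so inner special characters stay in the word, as the question requires) and that the special set is the list from the question. The names `SPECIALS`, `State` and `extractWords` are made up for illustration.

```typescript
// Assumption: the special set is the list given in the question,
// and only " " separates words.
const SPECIALS = ":,!.<;'\">[]{}`~=+-?/";

type State = "skipping" | "inWord";

function extractWords(text: string): string[] {
    const words: string[] = [];
    let state: State = "skipping";
    let wordStart = 0; // index of the first legal character of the current word
    let wordEnd = 0;   // index just past the last legal character seen so far

    // Iterate one position past the end so the final word is flushed like any other.
    for (let i = 0; i <= text.length; i++) {
        const atBoundary = i === text.length || text[i] === " ";
        if (atBoundary) {
            if (state === "inWord") {
                words.push(text.slice(wordStart, wordEnd));
            }
            state = "skipping";
        } else if (SPECIALS.indexOf(text[i]) === -1) {
            // Legal character: start a word if needed, and extend its end.
            if (state === "skipping") {
                wordStart = i;
                state = "inWord";
            }
            wordEnd = i + 1;
        }
        // Special characters need no action: leading ones precede wordStart,
        // trailing ones follow wordEnd, and inner ones fall inside the slice.
    }
    return words;
}
```

Because only two indices are tracked, each character is visited once and no intermediate per-token strings are built before the final `slice`.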

Answer 1: (score: 0)

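I used TypeScript and did it in a single pass. Note that the isSpecialCharacterCode(charCode) function simply checks whether the character's code matches the code of one of the given special characters. The same goes for the isWhitespaceCode(charCode) function.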
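The answer does not show the two helpers, so here is a minimal sketch of what they might look like. The bodies are an assumption: the special set is taken from the list in the question, and the choice of which code units count as whitespace (space, tab, line feed, carriage return) is a guess. `SPECIAL_CHARACTERS` and `specialCodes` are hypothetical names.

```typescript
// Assumption: the special characters are exactly the list from the question.
const SPECIAL_CHARACTERS = ":,!.<;'\">[]{}`~=+-?/";

// Build a lookup of UTF-16 code units once, so each check is a single property access.
const specialCodes: { [code: number]: boolean } = {};
for (let i = 0; i < SPECIAL_CHARACTERS.length; i++) {
    specialCodes[SPECIAL_CHARACTERS.charCodeAt(i)] = true;
}

function isSpecialCharacterCode(charCode: number): boolean {
    return specialCodes[charCode] === true;
}

// Assumption: space, tab, line feed and carriage return count as whitespace.
function isWhitespaceCode(charCode: number): boolean {
    return charCode === 32 || charCode === 9 || charCode === 10 || charCode === 13;
}
```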

  

function parseText(text: string): string[]{

    let words : string[] = [];
    let word = "";
    let charCode = 1;

    let haveSeenLegalChar = false; // set once a legal character has been seen in the current word

    let seenSpecialCharsToInclude = false; // set while special characters are buffered after a legal character

    let inBetweenSpecialChars = ""; // buffered special characters that may end up inside a word

    for(let index = 0; index < text.length; index++){

        charCode = text.charCodeAt(index);
        let isSpecialChar = isSpecialCharacterCode(charCode);
        let isWhitespace = isWhitespaceCode(charCode);
        if(isSpecialChar && !isWhitespace){
            // A special character falls into one of two cases.
            // 1) It may be part of a word, which is only possible if we have already seen at least one legal character.
            //    We cannot tell yet whether it belongs to the word, so buffer it for now.
            // 2) It is a leading or trailing special character, which must not be included in the word.
            if(haveSeenLegalChar){
                inBetweenSpecialChars += text[index];
                seenSpecialCharsToInclude = true;
            }else{
                // no legal character has been seen yet in this word, so this is a leading special character
                seenSpecialCharsToInclude = false;
                inBetweenSpecialChars = "";
            }
        }else if(isWhitespace){
            // Whitespace marks a word boundary.
            // If we have seen any legal character, push the word into the array.
            if(haveSeenLegalChar){
                words.push(word);
                word = "";
                inBetweenSpecialChars = "";
            }
            haveSeenLegalChar = false;
        }else if(!isSpecialChar){
            //legal character case
            haveSeenLegalChar = true;
            if(seenSpecialCharsToInclude){
                word += inBetweenSpecialChars;
                seenSpecialCharsToInclude = false;
                inBetweenSpecialChars = "";
            }
            word += text[index];
        }
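    }
    // Flush the last word when the text does not end with whitespace;
    // any buffered trailing special characters are dropped.
    if(haveSeenLegalChar){
        words.push(word);
    }
    return words;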
}

  