Find the first non-repeating character in a character stream

Time: 2014-07-10 09:28:28

Tags: c string algorithm

This question is an extension of the following question:

  

Find the first un-repeated character in a string

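My question is how to do this when, instead of a string, we have a running stream of characters. Would the approach of using two arrays, 'chars' and 'visited', as discussed in this solution, also work well for this problem?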

Edit: sample input - 3333111124. Here 2 is the first non-repeating character, assuming the query findCharacter() is made when the last character seen in the stream is '4'.

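Note: the stream is not stored anywhere (in contrast to the question above, where the string is already available in memory); we only have access to the last character seen in the stream.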
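The linked solution is not reproduced here, so the following is only a rough sketch of how a two-array idea might carry over to a stream. It assumes one array of occurrence counts and one array of first-seen positions; the names StreamCounter, count, firstSeen, and push are hypothetical, and only findCharacter() comes from the question. Nothing from the stream is stored apart from these fixed-size tables.

```cpp
#include <climits>

// Hypothetical sketch: array-only bookkeeping for a character stream.
struct StreamCounter {
    int count[256] = {0};       // occurrences of each character so far (capped at 2)
    long firstSeen[256] = {0};  // stream position of each character's first occurrence
    long position = 0;          // number of characters consumed so far

    // Consume one character from the stream in O(1).
    void push(char c) {
        unsigned char x = static_cast<unsigned char>(c);
        if (count[x] == 0) firstSeen[x] = position;
        if (count[x] < 2) ++count[x];  // cap at 2 so long streams cannot overflow
        ++position;
    }

    // Scan all 256 table entries and pick the character that occurred exactly
    // once and was seen earliest. Returns false if no such character exists.
    bool findCharacter(char &out) const {
        long best = LONG_MAX;
        bool found = false;
        for (int x = 0; x < 256; ++x) {
            if (count[x] == 1 && firstSeen[x] < best) {
                best = firstSeen[x];
                out = static_cast<char>(x);
                found = true;
            }
        }
        return found;
    }
};
```

Under these assumptions each query costs a scan of the 256-entry table, which is constant with respect to the stream length but slower than reading the head of a list.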

5 answers:

Answer 0: (score: 1)

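The idea is to use a DLL (doubly linked list) to efficiently get the first non-repeating character from a stream. The DLL contains all the non-repeating characters in order, i.e., the head of the DLL contains the first non-repeating character, the second node contains the second non-repeating character, and so on.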

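We also maintain two arrays: one array records the characters that have already been seen two or more times, which we call repeated[], and the other is an array of pointers to linked list nodes, which we call inDLL[]. The size of both arrays equals the alphabet size, typically 256.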

1) Create an empty DLL. Also create two arrays inDLL[] and repeated[] of size 256. 
   inDLL is an array of pointers to DLL nodes. repeated[] is a boolean array, 
   repeated[x] is true if x is repeated two or more times, otherwise false. 
   inDLL[x] contains pointer to a DLL node if character x is present in DLL, 
   otherwise NULL. 

2) Initialize all entries of inDLL[] as NULL and repeated[] as false.

3) To get the first non-repeating character, return character at head of DLL.

4) Following are steps to process a new character 'x' in stream.
  a) If repeated[x] is true, ignore this character (x is already repeated two
      or more times in the stream) 
  b) If repeated[x] is false and inDLL[x] is NULL (x is seen first time)
     Append x to DLL and store address of new DLL node in inDLL[x].
  c) If repeated[x] is false and inDLL[x] is not NULL (x is seen second time)
     Get DLL node of x using inDLL[x] and remove the node. Also, mark inDLL[x] 
     as NULL and repeated[x] as true.

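Note that appending a new node to the DLL is an O(1) operation if we maintain a tail pointer. Removing a node from the DLL is also O(1). So both operations, adding a new character and finding the first non-repeating character, take O(1) time.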

The following is the code in C++:
// A C++ program to find first non-repeating character from a stream of characters
#include <iostream>
#define MAX_CHAR 256
using namespace std;

// A linked list node
struct node
{
    char a;
    struct node *next, *prev;
};

// A utility function to append a character x at the end of DLL.
// Note that the function may change head and tail pointers, that
// is why pointers to these pointers are passed.
void appendNode(struct node **head_ref, struct node **tail_ref, char x)
{
    struct node *temp = new node;
    temp->a = x;
    temp->prev = temp->next = NULL;

    if (*head_ref == NULL)
    {
        *head_ref = *tail_ref = temp;
        return;
    }
    (*tail_ref)->next = temp;
    temp->prev = *tail_ref;
    *tail_ref = temp;
}

// A utility function to remove a node 'temp' from DLL. Note that the
// function may change head and tail pointers, that is why pointers to
// these pointers are passed.
void removeNode(struct node **head_ref, struct node **tail_ref,
                struct node *temp)
{
    if (*head_ref == NULL)
        return;

    if (*head_ref == temp)
        *head_ref = (*head_ref)->next;
    if (*tail_ref == temp)
        *tail_ref = (*tail_ref)->prev;
    if (temp->next != NULL)
        temp->next->prev = temp->prev;
    if (temp->prev != NULL)
        temp->prev->next = temp->next;

    delete(temp);
}

void findFirstNonRepeating()
{
    // inDLL[x] contains pointer to a DLL node if x is present in DLL.
    // If x is not present, then inDLL[x] is NULL
    struct node *inDLL[MAX_CHAR];

    // repeated[x] is true if x is repeated two or more times. If x is
    // not seen so far or x is seen only once, then repeated[x] is false
    bool repeated[MAX_CHAR];

    // Initialize the above two arrays
    struct node *head = NULL, *tail = NULL;
    for (int i = 0; i < MAX_CHAR; i++)
    {
        inDLL[i] = NULL;
        repeated[i] = false;
    }

    // Let us consider following stream and see the process
    char stream[] = "3333111124";
    for (int i = 0; stream[i]; i++)
    {
        // unsigned so that x is a valid array index even where plain char is signed
        unsigned char x = stream[i];
        cout << "Reading " << x << " from stream \n";

        // We process this character only if it has not occurred or occurred
        // only once. repeated[x] is true if x is repeated twice or more.
        if (!repeated[x])
        {
            // If the character is not in DLL, then add this at the end of DLL.
            if (inDLL[x] == NULL)
            {
                appendNode(&head, &tail, stream[i]);
                inDLL[x] = tail;
            }
            else // Otherwise remove this character from DLL
            {
                removeNode(&head, &tail, inDLL[x]);
                inDLL[x] = NULL;
                repeated[x] = true;  // Also mark it as repeated
            }
        }

        // Print the current first non-repeating character from stream
        if (head != NULL)
            cout << "First non-repeating character so far is " << head->a << endl;
    }
}

/* Driver program to test above function */
int main()
{
    findFirstNonRepeating();
    return 0;
}

The output will be as follows:

Reading 3 from stream 
First non-repeating character so far is 3
Reading 3 from stream 
Reading 3 from stream 
Reading 3 from stream 
Reading 1 from stream 
First non-repeating character so far is 1
Reading 1 from stream 
Reading 1 from stream 
Reading 1 from stream 
Reading 2 from stream 
First non-repeating character so far is 2
Reading 4 from stream 
First non-repeating character so far is 2
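
The demo above reads from a fixed array and prints after every character. For a real stream, the same bookkeeping could be wrapped in a small class so that characters are pushed as they arrive and the query is made on demand. The sketch below assumes std::list in place of the hand-written DLL; the names FirstUniqueTracker, push, and first are hypothetical and not part of the answer.

```cpp
#include <array>
#include <list>

// Hypothetical wrapper around the repeated[] / inDLL[] idea, using std::list.
class FirstUniqueTracker {
public:
    FirstUniqueTracker() {
        repeated_.fill(false);
        inList_.fill(false);
    }

    // Process one character from the stream in O(1).
    void push(char c) {
        const unsigned char x = static_cast<unsigned char>(c);
        if (repeated_[x]) return;      // step 4a: already seen twice or more
        if (!inList_[x]) {             // step 4b: first occurrence, append
            pos_[x] = order_.insert(order_.end(), c);
            inList_[x] = true;
        } else {                       // step 4c: second occurrence, unlink
            order_.erase(pos_[x]);
            inList_[x] = false;
            repeated_[x] = true;
        }
    }

    // Writes the first non-repeating character to 'out' and returns true,
    // or returns false if every character seen so far has repeated.
    bool first(char &out) const {
        if (order_.empty()) return false;
        out = order_.front();
        return true;
    }

private:
    std::list<char> order_;                           // non-repeating chars, in arrival order
    std::array<std::list<char>::iterator, 256> pos_;  // valid only where inList_[x] is true
    std::array<bool, 256> repeated_;
    std::array<bool, 256> inList_;
};
```

Feeding 3333111124 into push() one character at a time and then calling first() should yield '2', matching the output above. The sketch relies on std::list iterators staying valid while other elements are inserted or erased, which is what makes the stored positions safe to reuse.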

Answer 1: (score: 0)

Try using a hash map, in O(n):

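#include <stdio.h>
#include <string.h>

int main(void)
{
    /* a[c] counts how many times character c occurs in b */
    int a[256] = {0};
    const char *b = "Helloworldd";
    int n = (int)strlen(b);
    int i;

    for (i = 0; i < n; ++i)
        a[(unsigned char)b[i]]++;

    /* Scan forward for the first character that occurs more than once */
    for (i = 0; i < n; ++i)
        if (a[(unsigned char)b[i]] > 1)
        {
            printf("First Repeating %c\n", b[i]);
            break;
        }

    /* Scan backward (including index 0) for the last such character */
    for (i = n - 1; i >= 0; --i)
        if (a[(unsigned char)b[i]] > 1)
        {
            printf("Last Repeating %c\n", b[i]);
            break;
        }

    return 0;
}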

Answer 2: (score: 0)

Two arrays are not needed here. Complete code, based on my understanding of the question:

#include <cstring>
#include <iostream>
using namespace std;

int main(){
    bool * visited = new bool[256];
    memset(visited,false,256); //initialize to false
    char character;
    char answer = 'a'; // arbitrary initialization
    while(cin>>character){
        unsigned char idx = character; // non-negative array index
        if (!visited[idx]) answer = character; // latest character that had not been seen before
        visited[idx] = true;
        //Now embed your find character query according to your need
        //For example:
        if (character == '4') {
                cout<<answer<<endl;
                break;
         }
    }
    delete[] visited;
    return 0;
}

Answer 3: (score: 0)

Here is a Java implementation using BitSet (instead of a boolean array) and RxJava:
import java.util.BitSet;
import rx.Observable;
import rx.functions.Action1;

public class CharacterStream {

private static int  MAX_CHAR =  256;

private Node<Character> head;
private Node<Character> tail;

private BitSet repeated;

private Node<Character>[] inDLL;

@SuppressWarnings("unchecked")
public CharacterStream(final Observable<Character> inputStream) {

    repeated = new BitSet(MAX_CHAR);
    inDLL = new Node[MAX_CHAR];
    for (int i = 0; i < MAX_CHAR; i++) {
        inDLL[i] = null;
    }

    inputStream.subscribe(new Action1<Character>() {

        @Override
        public void call(Character incomingCharacter) {
            System.out.println("Received -> " + incomingCharacter);
            processStream(incomingCharacter);   
        }
    });
}

public Character firstNonRepeating() {
    if (head == null) {
        return null;
    }
    return head.item;
}

private void processStream(Character chr) {
    int charValue = (int) chr.charValue();
    if (!repeated.get(charValue)) {

        if (inDLL[charValue] == null) {
            appendToTail(chr);
            inDLL[charValue] = tail;
        } else {
            removeNode(inDLL[charValue]);
            inDLL[charValue] = null;
            repeated.set(charValue);
        }
    }
}

private void removeNode(Node<Character> node) {
    if (head == null) {
        return ;
    } 

    if (head == node) {
        head = head.next;
        //head.prev = null;
    } 
    if (tail == node) {
        tail = tail.prev;
        //tail.next = null;
    }

    if (node.next != null) {
        node.next.prev= node.prev;
    }
    if (node.prev != null) {
        node.prev.next = node.next;
    }
}

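private void appendToTail(Character chr) {
    Node<Character> temp = new Node<Character>(chr);
    if (head == null) {
        // Empty list: the new node is both head and tail.
        // Return here so the node is not linked to itself below.
        head = temp;
        tail = temp;
        return;
    }
    tail.next = temp;
    temp.prev = tail;
    tail = temp;
}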

private static class Node<E> {
    E item;
    Node<E> next;
    Node<E> prev;
    public Node(E val) {
        this.item = val;
    }
}

}

Here is the unit test case:

@Test
public void firstNonRepeatingCharInAStreamTest() {

    Observable<Character> observable = Observable.create(new OnSubscribe<Character>() {

        @Override
        public void call(Subscriber<? super Character> subscriber) {
            subscriber.onNext('N');
            subscriber.onNext('Z');
            subscriber.onNext('B');
            subscriber.onNext('C');
            subscriber.onNext('D');
            subscriber.onNext('A');
            subscriber.onNext('C');
            subscriber.onNext('B');
            subscriber.onNext('A');
            subscriber.onNext('N'); 
            //subscriber.onNext('Z');   
        }           
    });

    CharacterStream charStream = new CharacterStream(observable);
    assertThat(charStream.firstNonRepeating(), equalTo('Z'));
  }

Answer 4: (score: 0)

import java.util.*;

class uniqueElem{
    public static HashSet<Integer> hs = new HashSet<Integer>();

    public static void checkElem(){
        Scanner sc = new Scanner(System.in);
        boolean flag = true;
        while(flag == true){
            int no = sc.nextInt();
            if(no == -1) break;

            boolean check = hs.add(no);
            if(check == true){
                System.out.println("Occur first time: "+no);
            }
        }
    }
    public static void main(String args[]){
        checkElem();
    }
}