Haskell - Renaming duplicate values in a list of lists

Time: 2014-12-05 15:33:11

Tags: haskell duplicates nested-lists

I have a list of lists of strings, for example:

[["h","e","l","l","o"], ["g","o","o","d"], ["w","o","o","r","l","d"]]

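I want to rename values that also occur outside their own sublist, so that every such duplicate is replaced by a new, randomly generated value that does not already exist anywhere in the list but is the same throughout a given sublist. A possible result might be: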

[["h","e","l","l","o"], ["g","t","t","d"], ["w","s","s","r","z","f"]]

I already have a function, randomStr, that randomly generates a string of length 1:

randomStr :: String
randomStr = take 1 $ randomRs ('a','z') $ unsafePerformIO newStdGen

3 answers:

Answer 0: (score: 1)

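Assuming you want to do what I outlined in my comment below, it is best to break this problem into several smaller parts and solve them one at a time. I would also suggest taking advantage of the common modules in base and containers, since they make the code simpler and faster. In particular, the modules Data.Map and Data.Sequence are useful here. I think Data.Map is the more useful of the two, because it has some very handy functions that would otherwise be hard to write by hand. Data.Sequence is used at the end for efficiency, as you will see.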

First, the imports:

import           Data.List      (nub)
import           Data.Map       (Map)
import           Data.Sequence  (Seq, (|>), (<|))
import qualified Data.Map       as Map
import qualified Data.Sequence  as Seq
import           Data.Foldable (toList)
import           System.Random (randomRIO)
import           Control.Monad (forM, foldM)
import           Control.Applicative ((<$>))

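Data.Foldable.toList is needed because Data.Sequence does not have its own toList function, but Foldable provides one that works. On to the code. First, we want to take a list of Strings and find all the unique elements in it. For that we can use nub: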

lettersIn :: [String] -> [String]
lettersIn = nub

I like to give functions like this their own names; it can make the code more readable.

Now that we can get all the unique letters, we want to assign a random letter to each one:

makeRandomLetterMap :: [String] -> IO (Map String String)
makeRandomLetterMap letters
    = fmap Map.fromList
    $ forM (lettersIn letters) $ \l -> do
        newL <- randomRIO ('a', 'z')
        return (l, [newL])

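Here we get a new random character for each letter and essentially zip it up with our list of letters, then fmap (or <$>) Map.fromList over the result. Next, we need to be able to use this map to replace letters in a list. If a letter is not found in the map, we just want the letter back. Fortunately, Data.Map has the findWithDefault function, which is perfect for this situation: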

replaceLetter :: Map String String -> String -> String
replaceLetter m letter = Map.findWithDefault letter letter m

replaceAllLetters :: Map String String -> [String] -> [String]
replaceAllLetters m letters = map (replaceLetter m) letters
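
To make the fallback behaviour concrete, here is a small sketch of the two functions above applied to a hand-written map. The names exampleMap and example are only for illustration and are not part of the answer; in the real code the map would come from makeRandomLetterMap.

```haskell
import           Data.Map (Map)
import qualified Data.Map as Map

-- Same definitions as in the answer.
replaceLetter :: Map String String -> String -> String
replaceLetter m letter = Map.findWithDefault letter letter m

replaceAllLetters :: Map String String -> [String] -> [String]
replaceAllLetters m = map (replaceLetter m)

-- Hypothetical hand-written map, standing in for a randomly generated one.
exampleMap :: Map String String
exampleMap = Map.fromList [("o", "t"), ("l", "q")]

-- "o" is a key of the map and gets replaced; "g" and "d" are not keys,
-- so findWithDefault hands them back unchanged.
example :: [String]
example = replaceAllLetters exampleMap ["g", "o", "o", "d"]
-- example == ["g","t","t","d"]
```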

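Since we want to be able to update this map with the new letters encountered in each sublist, overwriting previously encountered letters as needed, we can use Data.Map.union. Since union is left-biased (it prefers its first argument when a key occurs in both maps), we need to flip it: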

updateLetterMap :: Map String String -> [String] -> IO (Map String String)
updateLetterMap m letters = flip Map.union m <$> makeRandomLetterMap letters
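
The left bias of Map.union is what makes the flip necessary, so a tiny sketch may help. The maps older and newer are made-up values for illustration only.

```haskell
import           Data.Map (Map)
import qualified Data.Map as Map

-- Hypothetical maps: 'older' plays the role of the accumulated map m,
-- 'newer' the map freshly built from the current sublist.
older, newer :: Map String String
older = Map.fromList [("o", "t"), ("l", "q")]
newer = Map.fromList [("o", "s"), ("d", "f")]

-- 'flip Map.union older newer' is 'Map.union newer older', so on the
-- shared key "o" the entry from 'newer' wins.
merged :: Map String String
merged = flip Map.union older newer
-- Map.toList merged == [("d","f"),("l","q"),("o","s")]
```

Without the flip, Map.union older newer would keep ("o","t") instead.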

Now we have all the tools we need to solve the problem at hand:

replaceDuplicatesRandomly :: [[String]] -> IO [[String]]
replaceDuplicatesRandomly [] = return []

For the base case, just return an empty list.

replaceDuplicatesRandomly (first:rest) = do
    m <- makeRandomLetterMap first

For a non-empty list, build the initial map from the first sublist,

    (_, seqTail) <- foldM go (m, Seq.empty) rest

fold over the rest, starting from an empty sequence and that first map, and extract the resulting sequence,

    return $ toList $ first <| seqTail

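and then convert the sequence to a list after prepending the first sublist (which this function does not change). The go function is also quite simple: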

    where
        go (m, acc) letters = do
            let newLetters = replaceAllLetters m letters
            newM <- updateLetterMap m letters
            return (newM, acc |> newLetters)

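It takes the current map m together with acc, the accumulation of all sublists processed so far, and the current sublist letters. It replaces the letters in that sublist, builds a new map for the next iteration (newM), and returns the new map along with the accumulation of everything processed, i.e. acc |> newLetters. Altogether, the function is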

replaceDuplicatesRandomly :: [[String]] -> IO [[String]]
replaceDuplicatesRandomly [] = return []
replaceDuplicatesRandomly (first:rest) = do
    m <- makeRandomLetterMap first
    (_, seqTail) <- foldM go (m, Seq.empty) rest
    return $ toList $ first <| seqTail
    where
        go (m, acc) letters = do
            let newLetters = replaceAllLetters m letters
            newM <- updateLetterMap m letters
            return (newM, acc |> newLetters)

Answer 1: (score: 1)

It is always better to keep impure and pure computations separate.

You cannot replace with letters that are already in the list, so you need to get a string of fresh letters:

fresh :: [String] -> String
fresh xss = ['a'..'z'] \\ foldr union [] xss
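
As a quick check of what fresh produces for the question's input, here is the same definition with the Data.List imports it needs. The name unused is only for illustration.

```haskell
import Data.List (union, (\\))

-- Same definition as in the answer: every lowercase letter that does
-- not occur in any of the given strings.
fresh :: [String] -> String
fresh xss = ['a'..'z'] \\ foldr union [] xss

-- The input uses h, e, l, o, g, d, w and r, so 18 letters remain.
unused :: String
unused = fresh ["hello", "good", "world"]
-- unused == "abcfijkmnpqstuvxyz"
```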

This function replaces one letter with another in a string:

replaceOne :: Char -> Char -> String -> String
replaceOne y y' = map (\x -> if x == y then y' else x)

This function replaces one letter in each string of a list, taking a new fresh letter for every string that contains it:

replaceOnes :: Char -> String -> [String] -> (String, [String])
replaceOnes y = mapAccumL (\(y':ys') xs ->
    if y `elem` xs
       then (ys', replaceOne y y' xs)
       else (y':ys', xs))

For example,

replaceOnes 'o' "ijklmn" ["hello", "good", "world"]

returns

("lmn",["helli","gjjd","wkrld"])

This one is a bit trickier:

replaceMany :: String -> String -> [String] -> (String, [String])
replaceMany ys' ys xss = runState (foldM (\ys' y -> state $ replaceOnes y ys') ys' ys) xss

For each character in ys, this function replaces that character in every string of xss, each time with a new fresh letter taken from ys'.

For example,

replaceMany "mnpqstuvxyz" "lod" ["hello", "good", "world"]

returns

("vxyz",["hemmp","gqqt","wsrnu"])

'l's in "hello" are replaced by the first   letter in "mnpqstuvxyz"
'l'  in "world" is  replaced by the second  letter in "mnpqstuvxyz"
'o'  in "hello" is  replaced by the third   letter in "mnpqstuvxyz"
'o's in "good"  are replaced by the fourth  letter in "mnpqstuvxyz"
...
'd'  in "world" is  replaced by the seventh letter in "mnpqstuvxyz"
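
If the State version is hard to read, the same threading can be written as a plain left fold. This is only a sketch of an equivalent formulation, not the answer's code: the name replaceManyFold is made up, and the extra clause for an empty pool is my own assumption (the answer's lambda has no such case and would fail on a pattern match instead).

```haskell
import Data.List (mapAccumL)

replaceOne :: Char -> Char -> String -> String
replaceOne y y' = map (\x -> if x == y then y' else x)

-- Like the answer's replaceOnes, plus a clause for an exhausted pool.
replaceOnes :: Char -> String -> [String] -> (String, [String])
replaceOnes y = mapAccumL step
  where
    -- Assumption: with no fresh letters left, leave the string as it is.
    step [] xs = ([], xs)
    step pool@(y' : rest) xs
      | y `elem` xs = (rest, replaceOne y y' xs)
      | otherwise   = (pool, xs)

-- Thread both the remaining pool and the rewritten strings through a
-- fold over the characters to be replaced.
replaceManyFold :: String -> String -> [String] -> (String, [String])
replaceManyFold pool ys xss = foldl step (pool, xss) ys
  where
    step (remaining, acc) y = replaceOnes y remaining acc

-- replaceManyFold "mnpqstuvxyz" "lod" ["hello", "good", "world"]
--   == ("vxyz",["hemmp","gqqt","wsrnu"])
```

Here the pair (remaining, acc) plays the role that the foldM accumulator and the State value play in replaceMany.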

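This function walks through a list of strings and, in every string in the rest of the list, replaces all letters that occur in the head with fresh letters taken from ys'.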

replaceDuplicatesBy :: String -> [String] -> [String]
replaceDuplicatesBy ys'  []      = []
replaceDuplicatesBy ys' (ys:xss) = ys : uncurry replaceDuplicatesBy (replaceMany ys' ys xss)

I.e. it does what you want, but without any randomness: it just picks fresh letters from the list.

All the functions described so far are pure. Here is an impure one:

replaceDuplicates :: [String] -> IO [String]
replaceDuplicates xss = flip replaceDuplicatesBy xss <$> shuffle (fresh xss)

I.e. generate a random permutation of the string containing the fresh letters and pass it to replaceDuplicatesBy.

You can get the shuffle function from https://www.haskell.org/haskellwiki/Random_shuffle
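
For readers who only need something to try the code with, here is a minimal stand-in for shuffle. It is not one of the implementations from the wiki page; it simply tags each element with a random key and sorts by that key, assuming the random package is available.

```haskell
import Control.Monad (replicateM)
import Data.List     (sortOn)
import System.Random (randomRIO)

-- Assumption: a simple sort-by-random-key shuffle is good enough here.
-- Equal keys are extremely unlikely with Double, but they would keep
-- the original relative order, so this is a sketch rather than a
-- carefully unbiased shuffle.
shuffle :: [a] -> IO [a]
shuffle xs = do
  keys <- replicateM (length xs) (randomRIO (0 :: Double, 1))
  return (map snd (sortOn fst (zip keys xs)))
```

It has the type [a] -> IO [a], which is what replaceDuplicates above expects.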

A final test:

main = replicateM_ 3 $ replaceDuplicates ["hello", "good", "world"] >>= print

prints

["hello","gxxd","wcrzy"]
["hello","gyyd","wnrmf"]
["hello","gmmd","wvrtx"]

The whole code (without shuffle): http://lpaste.net/115763

Answer 2: (score: -2)

I think this will surely raise more questions than it answers.

import Control.Monad.State
import Data.List
import System.Random

mapAccumLM _ s [] = return (s, [])
mapAccumLM f s (x:xs) = do
  (s', y) <- f s x
  (s'', ys) <- mapAccumLM f s' xs
  return (s'', y:ys)

pick excluded for w = do
  a <- pick' excluded
  putStrLn $ "replacement for " ++ show for ++ " in " ++ show w ++ " excluded: " ++ show excluded ++ " = " ++ show a
  return a

-- | XXX -- can loop indefinitely
pick' excluded = do
  a <- randomRIO ('a','z')
  if elem a excluded
    then pick' excluded
    else return a

transform w = do
  globallySeen <- get
  let go locallySeen ch =
        case lookup ch locallySeen of
          Nothing  -> if elem ch globallySeen
                        then do let excluded = globallySeen ++ (map snd locallySeen)
                                a <- lift $ pick excluded ch w
                                return ( (ch, a):locallySeen, a)
                        else return ( (ch,ch):locallySeen,       ch )
          Just ch' -> return (locallySeen, ch')
  (locallySeen, w') <- mapAccumLM go [] w
  let globallySeen' = w' ++ globallySeen
  put globallySeen'
  return w'

doit ws = runStateT (mapM transform ws) []

main = do
 ws' <- doit [ "hello", "good", "world" ]
 print ws'
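
As the XXX comment says, pick' retries until it draws a letter that is not excluded, so it never returns once all 26 letters are excluded. One possible way around that, sketched here under the assumption that returning Nothing is acceptable when no letter is left, is to draw from the remaining letters directly. The name pickFresh is made up for this example.

```haskell
import Data.List     ((\\))
import System.Random (randomRIO)

-- Pick a random lowercase letter that is not in 'excluded', or Nothing
-- if every letter is already excluded. Always terminates.
pickFresh :: String -> IO (Maybe Char)
pickFresh excluded =
  case ['a'..'z'] \\ excluded of
    []         -> return Nothing
    candidates -> do
      i <- randomRIO (0, length candidates - 1)
      return (Just (candidates !! i))
```

Plugging this into transform would mean handling the Nothing case at the call site, which the code above does not do.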